Frank Heckenbach wrote:
Waldek Hebisch wrote:
As I understand it, the main problems with automake are:
- bugs
- maintenance
IMHO the main remaining bug is that gpc writes to GPI files after compiling the interface. That bug affects `gp' too. Both automake and `gp' try to work around that, but the real fix is to write implementation info in another place (for example a new GPI file).
Why do you think this is a bug, and where do you think gp tries to work around it? (In fact, gp doesn't even look at the contents of GPI files. It looks at the timestamps, but that's more like an additional paranoia check.)
The "language independent" way to decide if a file is up to date is to look at its timestamp. Writing to GPI files after compiling the interface means that the timestamp may change, and that breaks dependency analysis -- for example it is impossible to put dependencies in the makefile. In the case of gpc itself we have this ugly `stamp-error-gpi' (which already caused multiple build failures) where we should have dependencies on `.gpi' files.
As I wrote in another message, I implemented a `--disable-gpi-extension' option. With that option I can do:
gpc --automake --interface-only *.pas
gpc --disable-gpi-extension --implementation-only *.pas
gpc *.o
and that works even if there are cyclic imports (and implementations can be anywhere). Without `--disable-gpi-extension' the second line fails. I treat looking at the content of GPI files as a workaround. More precisely, I treat avoiding recompilation using extra info as a valuable optimisation, but the build should work correctly using only timestamps on files -- otherwise integration into makefiles gets very fragile.
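To make the timestamp argument concrete, here is a minimal sketch of a make-style freshness check. The file names `a.gpi' and `b.o' and the fixed times are made up for illustration; this is not how gpc, automake or gp are implemented, only the rule a makefile would apply.

```python
import os
import tempfile


def is_up_to_date(target, prerequisites):
    """Make-style rule: a target is current if it exists and no prerequisite is newer."""
    if not os.path.exists(target):
        return False
    target_time = os.path.getmtime(target)
    return all(os.path.getmtime(p) <= target_time for p in prerequisites)


def demo():
    """Show how rewriting an interface file later makes a dependent look stale."""
    with tempfile.TemporaryDirectory() as d:
        gpi = os.path.join(d, "a.gpi")  # hypothetical interface file
        obj = os.path.join(d, "b.o")    # hypothetical object depending on a.gpi
        for path in (gpi, obj):
            with open(path, "w"):
                pass
        os.utime(gpi, (1000, 1000))     # interface compiled at t=1000
        os.utime(obj, (2000, 2000))     # dependent compiled at t=2000
        before = is_up_to_date(obj, [gpi])
        os.utime(gpi, (3000, 3000))     # GPI written again at t=3000
        after = is_up_to_date(obj, [gpi])
        return before, after
```

Calling demo() returns the check result before and after the second write to the interface file. The second write alone is enough to make `b.o' look out of date, even though nothing it depends on has changed in substance.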
Concerning maintenance, I do not think the cost is so big, and one part (updating the `gpc' driver) is ATM "paid" for the next few months.
IIRC, about half of your recent patch is for the driver (the other half are qualified identifiers and other substantial features), and that's only for gcc-3.3.x->3.4.0. If that's not a big cost, I don't know.
The diff is huge -- probably 70-80% of the patch. But there is almost no really new code here. The main effort was to add back code removed from gcc-3.4.0 (+ conditioning on version). That was not trivial, as there were a few nasty incompatibilities between the old and new driver. Nevertheless, it took less than 20% of the effort (probably below 10%).
Two additional points: 1) The time saved by the gcc folks when they introduced the incompatibility is probably much smaller than what I spent to correct the problem. Similarly, I believe that my time spent on keeping things compatible will save other folks much more than I lost. 2) The gcc-3.3.x->3.4.0 changes were large. Part was (better) multilib support. They also changed the directory structure. I hope that changes in the near future will be smaller.
And there are additional kludges in module.c (and smaller parts in other source files). You might not have had to work on this code yet, but I had to, many times, and the code is very fragile.
OK, you know that better.
And that's where the real automake bugs are AFAIK. Currently I made it recompile too much rather than too little (which is bad for efficiency, and we don't even talk about checking file contents (MD5 sums as GP does) instead of just timestamps), and still it might miss some recompilations (didn't check exactly, and such situations are very tricky to debug and unnerving -- I had to do this a number of times). The core problem is that it has only local information (about the current module and its dependencies).
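For comparison with the timestamp rule, a content check along the lines of the MD5 sums mentioned above could look roughly like this. How GP actually stores and compares its sums is not described in this thread, so the function names and the idea of a "recorded" digest are assumptions for illustration only.

```python
import hashlib
import os
import tempfile


def digest(path):
    """Return the MD5 hex digest of a file's contents."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def content_changed(path, recorded_digest):
    """True if the file's contents differ from the digest recorded earlier."""
    return digest(path) != recorded_digest


def demo():
    """Rewrite a file with identical, then different, contents and compare."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "unit.pas")  # hypothetical source file
        with open(path, "w") as f:
            f.write("unit u; interface implementation end.\n")
        recorded = digest(path)
        # Same contents written again: the timestamp changes, the digest does not.
        with open(path, "w") as f:
            f.write("unit u; interface implementation end.\n")
        same = content_changed(path, recorded)
        with open(path, "a") as f:
            f.write("{ edited }\n")
        edited = content_changed(path, recorded)
        return same, edited
```

The point of the sketch is only that a file rewritten with identical contents is not reported as changed, which a pure timestamp comparison cannot tell apart from a real edit.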
I do not see why using only local information is wrong -- of course assuming that all local information is consistent. I wonder if it really is not the same bug -- I say that local info is getting inconsistent and we should correct that. You say that we should not rely on local info. I am afraid that we are repeating what we wrote earlier. But all the problems with automake that I saw are solved by keeping correct timestamps.
I am somewhat afraid of the effort needed to maintain `gp'. I have only had a short look over the sources and have not given it a real trial. However, from your comments I understand that ATM `gp' misses some automake features.
Which ones?
Quick test (I still have to read the info file, so maybe there is an easy way to do that):
../../tst25/gp-0.54/gp PC='/pom/kompi/gcc/tst26/gpc7-3.4/gcc/xgpc' -B/pom/kompi/gcc/tst26/gpc7-3.4/gcc/ --no-default-paths -c *.pas
gp: only one file name allowed on the command line
Intent: compile everything without linking
Also:
../../tst25/gp-0.54/gp PC='/pom/kompi/gcc/tst26/gpc7-3.4/gcc/xgpc' -B/pom/kompi/gcc/tst26/gpc7-3.4/gcc/ --no-default-paths --interface-only -c *.pas
gp: `--interface-only' is meant to be used with GPC/GCC directly, not with GP
Intent: compile all interfaces
Actually, things like '*.pas' seem a good way to tell GP: "Scan the sources to find interfaces and implementations, then compile and link what is needed"
All in all, GP works quite well (as you see, I have some nasty wishes).
BTW, since GP is written mostly in Pascal, it should be easier for many people to maintain it when necessary. (At least from the negative comments I've often seen about GPC being written in C.)
I think that GPC being in C is a big plus. Namely, we reuse _a lot_ of work that the C folks did. And we can easily find a C compiler to bootstrap. I often hear comments like "that code is too complicated, I prefer to write from scratch than reuse it". And GCC code is considered "hard" compared to other code. I am afraid that many folks who say "I can not help -- it is in C" would say "I can not help -- it is too complicated".
GP code should be simpler than typical compiler code; that is a plus. But in some parts C and Pascal have to be in sync. Without automatic checks that may be a problem.
Also, I am somewhat concerned by the `gp' parser (you use flex to implement it, but conceptually it is a parser).
Yes.
In fact, it is not clear that in GNU Pascal one can find the names of imported interfaces/units without looking at all of the imported interfaces.
I'm not sure exactly what you mean. If a program says `uses foo;' or `import foo;', the name of the imported unit/interface is `foo'. Do you mean the file name (it's `foo.{pas,p,...}' unless `in '...'' is given etc., that's the same for automake)? Or do you mean names of indirectly imported interfaces (they're stored in gpd files etc.)?
Well:
program foo;
type bar = (a);
operator uses(x, y : bar) z : bar;
begin z := x end;
var import, v : bar;
begin
  import := v uses v;
end.
As you see, `uses v;' is not a `uses' clause at all. Probably it is enough to check if there is a semicolon before it -- what I am asking is whether you have checked all productions of the bison grammar to verify that we will not get something more nasty.
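The "semicolon before" idea can be sketched as follows. This is only a toy scanner written for this mail, not the flex scanner in gp and not derived from the bison grammar; it handles plain comma-separated name lists and nothing else (no `in '...'', no qualified or `only' imports), and it does not show that the heuristic is sufficient.

```python
import re

# Comments and strings are matched first so that they can be skipped.
_TOKEN = re.compile(
    r"""
    (?P<comment>\{.*?\}|\(\*.*?\*\))
  | (?P<string>'(?:[^']|'')*')
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<number>\d+)
  | (?P<punct>:=|[^\sA-Za-z0-9_])
    """,
    re.VERBOSE | re.DOTALL,
)
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

EXAMPLE = """
program foo;
type bar = (a);
operator uses(x, y : bar) z : bar;
begin z := x end;
var import, v : bar;
begin
  import := v uses v;
end.
"""


def import_clauses(source):
    """Return names from `uses'/`import' clauses that directly follow a semicolon."""
    tokens = [m.group() for m in _TOKEN.finditer(source)
              if m.lastgroup in ("ident", "number", "punct")]
    found = []
    i = 0
    while i < len(tokens):
        word = tokens[i].lower()
        prev = tokens[i - 1] if i > 0 else ";"  # start of file counts as a boundary
        if word in ("uses", "import") and prev == ";":
            i += 1
            # Collect a plain comma-separated list of names.
            while i < len(tokens) and _IDENT.fullmatch(tokens[i]):
                found.append(tokens[i])
                i += 1
                if i < len(tokens) and tokens[i] == ",":
                    i += 1
                else:
                    break
            continue
        i += 1
    return found
```

On EXAMPLE the scanner reports no imports, because every occurrence of `uses' and `import' there follows something other than a semicolon. On "program p; uses foo, bar; begin end." it reports foo and bar. Whether some other production lets an identifier named `uses' appear right after a semicolon is exactly the open question.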
And, of course, GP does look at all of the imported units/modules (either their .gpd files if existing and up to date, or their source code). It just does it in a cleaner (IMHO) way, file by file, rather than recursively intermixed (and often duplicated) as automake.
The problem is that parsing requires knowing what a keyword is, and many keywords are "weak" -- they may be redefined (also in imported interfaces!). Frank, you have probably analysed that, but the assumptions needed for `gp' to work should be written up.
Yes. In fact it may fail in some very strange cases (very crude mixtures of dialects, I doubt whether they'd ever occur in practice).
I am afraid that "I doubt whether they'd ever occur in practice" was used when automake was designed. A robust tool should accept _everything_ that `gpc' accepts (it may be slightly more tolerant of malformed inputs, but, to limit confusion, not too much). Note that automake by its very nature uses the same parser, so it is immune to such problems.
But this parser is only a small part of gp and easy to fix/rewrite. In fact, I'm thinking about using the real GPC parser -- either as a special mode to GPC itself (comparable to `-E' for the preprocessor which lists dependencies only), but it might be too much effort to "do nothing" in many places; or just taking the bison parser and removing everything unrelated (which will be a lot, including most semantic actions, so a rather simple parser remains). I think that's not a big deal, but before I do anything about it, first I want to integrate the GPC preprocessor because this may influence things.
I wonder what stage the preprocessor is at. Actually, I was going to do the opposite -- integrate enough of the backend into `gpcpp' to get predefines working with 3.3.x and 3.4. I am slightly concerned as you engage in large rewrites that take time. Maybe it is better to go on with the old preprocessor -- IIRC the only problem I saw was the ppc backend, but that is solved by the patch I posted.
And some time is needed to assess whether changes to the GNU Pascal language will invalidate those assumptions.
Sure, changes that affect the import and export syntax could, but again, that's not a big deal. Adding a few rules to the parser (whether flex based as now, or bison based) is a matter of minutes (maybe less than it takes me to write this mail) and I expect it to happen very rarely.
I am somewhat concerned that we have only tests to verify that parsing works correctly. With the old (non-glr) bison parser in GPC we had a guarantee of uniqueness (+ 1 symbol lookahead), which greatly reduced the possibility of surprising parses. Now we have a less predictable bison parser and we want to keep it in sync with the parser in GP. It would be nice to have automatic verification that both parsers give equivalent results. (One of my pet projects is a tool to check uniqueness of the parse for an "arbitrary" grammar -- of course unsolvable, but a bunch of tricks may go far enough.)
So IMHO we need a few months (or a year) for `gp' to stabilise _before_ we can deprecate automake. And I think that we should have a year of deprecation before removal.
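One cheap form of such a check is to run both dependency extractors over the same set of sources and report every file where they disagree. The sketch below uses two toy regex extractors as stand-ins; a real check would call the GPC parser and the GP scanner, and no interface for doing that is described in this thread.

```python
import re


def naive_uses(source):
    """Toy extractor A: treats every `uses' followed by a name as an import."""
    return re.findall(r"\buses\s+([A-Za-z_]\w*)", source, re.IGNORECASE)


def guarded_uses(source):
    """Toy extractor B: accepts `uses' only at the start or right after a semicolon."""
    return re.findall(r"(?:^|;)\s*uses\s+([A-Za-z_]\w*)", source, re.IGNORECASE)


def disagreements(extract_a, extract_b, corpus):
    """Return (name, result_a, result_b) for every source where the results differ."""
    out = []
    for name, source in corpus.items():
        a, b = extract_a(source), extract_b(source)
        if a != b:
            out.append((name, a, b))
    return out


# Hypothetical two-file corpus; a real one would be the test suite sources.
CORPUS = {
    "plain": "program p; uses foo; begin end.",
    "operator": "program q; type bar = (a); operator uses(x, y : bar) z : bar; "
                "begin z := x end; var v, w : bar; begin w := v uses v; end.",
}
```

Calling disagreements(naive_uses, guarded_uses, CORPUS) flags only the "operator" entry, where extractor A mistakes the operator application for an import of `v'. This only finds differences on the inputs we happen to have, so it complements rather than replaces an argument about the grammar.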
We can discuss time scales. I'd like to have automake deprecated in gpc-2.2, but I don't know when gpc-2.2 will be released (maybe next year).
Realistically, our timeframes look similar.