spir wrote (by PM, but I supposed it was meant for the list):
Yes, it's possible, but it's IMHO not conformtable. I know I'm repeating myself, but I'm really fed up with having to "reimplement the list", as I wrote. It's also a psychological thing IMHO -- if advanced data structures (objects, trees, hash tables) are readily available (as in C++), you tend you use them when useful; if they're not (as in Pascal, except for objects), you're much more likely to do quick ad-hoc work-arounds, which affect the efficiency, but also the design of your program, especially in the long run. GCC's TREE_NODEs are a particularly bad example of this, spreading all over the code, but also in my Pascal programs, I often find a lot of ad-hoc list (or other structures) handling code that obscures the actual purpose of my code and thus make it harder to understand later.
This very clearly shows how much a (fully flexible and generic) sequence type is a fondamental feature for any general-purpose language. Thus, it must be part of new-GPC (meaning also prerequisites such a generic & referenced types); then, we have them available to write the new-GPC compiler in new-GPC ;-)
I'd say not only simple lists (sequences), also hash tables, vectors (dynamic arrays) and trees are data structures I use regularly and would like to have readily available.
I take the opportunity to ask a side-question: is there a logical or practical link between compiler target and implementation languages? For instance, provided GPC's feature set is good enough (see above) to write a compiler comfortably,
At the moment it isn't yet WRT the above features.
That's the chicken and egg problem I referred to. Either one uses C++ initially (which has the features) to implement a new compiler with the features and then ports the compiler to Pascal, or one uses the existing GPC to write the new compiler in a more awkward way and later cleans it up when it's self-hosting.
Kevan Hashemi wrote:
I guarantee you that's the case for me. I read through Bison and I liked it very much, and I am thoroughly convinced of its merit by your ^C example. But my first look through the Bison manual page suggested that the output was a C program that we would then compile and link to, which means that at least part of our Pascal compiler would be written in C.
The Bison output basically consists of generated tables (which are mostly numbers), and a fixed parsing algorithm that uses these tables which is more or less verbatim copied from a template file. It provides several such templates, the basic C algorithm, a GLR algorithm (that parses a larger class of syntax, which GPC requires) also written in C, and one in C++.
So it's possible to add a Pascal template as well. Of course, it would have to be written (probably by us), which is some work -- though probably minor compared to the rest of the compiler, and of course, a one-time job, independent of grammar changes, and it would be useful to Pascal programmers who'd like to implement their own parsers.
Given the extent of my inexperience, but at the same time my eagerness to help, perhaps you could explain to me how we could use Bison on the way to a self-compiling Pascal, or how the C-code output of Bison (if that's indeed what it puts out) could be part of the project in a way that left us independent of the GCC back-end problems.
It's nothing to do with backend problems. A newly written compiler implemented in C(++) with a non-GCC target (whether C++, Ada or LLVM) could still use Bison as is.
Frank
Frank Heckenbach wrote:
I'd say not only simple lists (sequences), also hash tables, vectors (dynamic arrays) and trees are data structures I use regularly and would like to have readily available.
I suggest differential binary trees http://knol.google.com/k/differential-binary-trees?hl=en# to be part of it too.
Regards,
Adriaan van Os