Adriaan van Os wrote:
Frank Heckenbach wrote:
Jonas Maebe wrote:
What's taking so long currently, Adriaan, is probably the GPI imports. That's independent of the target, i.e. a Pascal to C++ converter would have to do it just the same, so its complexity is independent of how the output is structured. As I said, this is a separate issue which would be easier to tackle if the compiler were written in a high-level language (such as C++ or Pascal with templates). It's easier to find and experiment with efficient data structures (e.g., hash tables, trees) when they're readily available than when you have to implement them manually each and every time, as in C (for the current GPC) and, so far, also in Pascal.
I agree that it is a separate issue and that it is easier to tackle in a future compiler. So, I am moving this to a new thread. We need not discuss it any further now, but I am still following up to answer your questions.
The problem is not so much the speed of GPI loading as such, but the fact that unit recompilation is of order 2 in GPC (quadratic in the number of units) instead of order 1 (linear), as in FPC.
Imagine a program P that uses unit1..unitN, where each unitK uses unit1..unitK-1. Currently, compiling program P with GPC
- triggers a compilation of unit1 and writes a unit1.gpi
- triggers a compilation of unit2, which uses unit1, so loads unit1.gpi and writes unit2.gpi
- triggers a compilation of unit3, which uses unit1 and unit2, so loads unit1.gpi and unit2.gpi and writes unit3.gpi
- etcetera
- triggers a compilation of unitK, which uses unit1..unitK-1, so loads unit1.gpi..unitK-1.gpi and writes unitK.gpi
- etcetera
So, unit1 is written once and loaded N-1 times, unit2 is written once and loaded N-2 times, and in general unitK is written once and loaded N-K times. Therefore, N .gpi files are written and (N-1) + (N-2) + ... + 1 + 0 = N * (N-1)/2 .gpi files are loaded.
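To make the shape of that chain concrete, here is a minimal sketch (unit and file names are invented for illustration; the syntax is the usual Borland-style units that GPC accepts):

  { unit3.pas -- stands in for unitK: it uses all earlier units }
  unit Unit3;

  interface

  uses Unit1, Unit2;  { compiling this loads unit1.gpi and unit2.gpi }

  implementation

  end.

  { p.pas -- the main program pulls in the whole chain }
  program P;

  uses Unit1, Unit2, Unit3;

  begin
  end.

Every unitK down the chain repeats the loads of all its predecessors in a fresh compiler process, which is where the triangular number above comes from.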
Ah, I see. I thought the issue was that a unit import/export with N declarations took O(N^2) time. I think we fixed that (or at least some cases of it).
In other words, compilation times increase quadratically with the number of used units. Improving .gpi load times doesn't help much (within a fixed time budget, the number of units you can handle grows only with the square root of the load-time improvement) and the process will still be slow when N is large. The only real solution is to make compilation a linear process, where already-loaded .gpi files are not loaded again. My understanding is that this is difficult to accomplish with the current GPC back-end.
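To spell out the square-root remark (a rough model that counts only .gpi loading, with an assumed constant cost c per load):

  T(N) ~ c * N*(N-1)/2 ~ c * N^2 / 2

so with a loader that is k times faster, a fixed time budget T covers

  N' = sqrt(2*T*k/c) = sqrt(k) * N

units, i.e. a 4-fold loading speedup only doubles the number of units manageable in the same time.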
It's not about the backend, but about the driver/compiler structure (gp (or "gpc --automake") calls gpc, and gpc calls gpc1), and it's not difficult, but impossible in the current structure, because each unit is compiled by a separate process which has to load all the interfaces it uses.
The same problem exists with C/C++ header files, and the common solution there is to use include guards (http://en.wikipedia.org/wiki/Include_guard).
No, that's not the same problem. The problem there is that the compilation of a single source file would be O(N^2) if it includes N headers, and each header in turn includes the previous ones. (And it would lead to duplicate declarations.)
This problem would *also* occur in Pascal (so the whole compilation would be O(N^3)) if GPC (like probably all other Pascal compilers) didn't prevent it by keeping a list of imported interfaces and avoiding duplicate imports.
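A minimal sketch of that bookkeeping (not GPC's actual code; the names, the fixed array size, and the `value' initializer -- an Extended Pascal feature GPC supports -- are all just for illustration):

  const
    MaxInterfaces = 1024;

  var
    Loaded: array [1 .. MaxInterfaces] of String (255);
    LoadedCount: Integer value 0;

  function AlreadyLoaded (const Name: String): Boolean;
  var
    I: Integer;
  begin
    AlreadyLoaded := False;
    for I := 1 to LoadedCount do
      if Loaded[I] = Name then
        AlreadyLoaded := True
  end;

  procedure ImportInterface (const Name: String);
  begin
    if not AlreadyLoaded (Name) then
      begin
        Inc (LoadedCount);
        Loaded[LoadedCount] := Name;
        { ... actually read the .gpi file for Name here ... }
      end
  end;

(With a linear scan each lookup is O(N); this is exactly the kind of place where a readily available hash table, as mentioned at the top of the thread, would pay off.)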
BTW, is it the actual Mac OS X interfaces that are so huge, or your
It is the actual Mac OS X interfaces that are so huge,
So if I understand you correctly now, the OS interfaces span that many units. Now I understand why you suggested outputting them all to a single C++ file (since they probably change rarely, only with upgrades, and then all at once).
Can't you just put them all in a single unit, or would it become so large that you couldn't compile it (for memory reasons or so)?
Another idea (which, if possible, works now, and might also be necessary for a future GPC, if any) is to reduce the number of dependencies (maybe requiring some restructuring). I find it hard to believe that really every unit requires all previous ones. Can't they be grouped in "topics" that are mostly independent of each other, so the "uses" graph would look more like a tree than a linear chain? E.g., a roughly balanced binary tree would give O(N log N), which should be acceptable.
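As a sketch of what that grouping could look like (all names invented; each unit would live in its own file, and whether the topic members are re-exported through the topic unit depends on the dialect, so this only shows the shape of the graph):

  { graphicstopic.pas -- depends only on its own members }
  unit GraphicsTopic;
  interface
  uses QuickDrawUnit, CoreGraphicsUnit;
  implementation
  end.

  { filestopic.pas -- independent of GraphicsTopic }
  unit FilesTopic;
  interface
  uses FileManagerUnit, AliasManagerUnit;
  implementation
  end.

  { p.pas -- sees independent subtrees instead of one linear chain }
  program P;
  uses GraphicsTopic, FilesTopic;
  begin
  end.

Each unit's compilation then loads only the interfaces in its own subtree; summed over all units that is O(N log N) for a balanced tree instead of the O(N^2) of a linear chain.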
Frank