Prof. A Olowofoyeku (The African Chief) wrote:
- Is it possible to have reduced functionality versions of libgpc (perhaps produced with a switch when building the compiler)? If so, is it possible to choose which features shall be built into it (e.g., via a simple text configuration file, of the kind you have when building the Linux kernel, or busybox, etc.)?
If so, then configure options might be the obvious way to go.
What I mean is something like (please note, this is off the top of my head, and is not properly thought through - it may even be impossible or the necessary features may not be in libgpc at all, but rather in the compiler):
They're in both of them which doesn't help. I.e., if you built an RTS without string support, some operations that don't look like RTS calls (e.g. "+" for strings) would lead to undefined linker references.
One could add explicit checks in the compiler, but (slightly tangential to the topic) I'm thinking of a different route here: So far, the compiler creates RTS calls based on "magic" linker names ("_p_Set_Union" etc.) and makes implicit declarations for them which have to match the RTS declarations. Now we could instead properly import the RTS declarations from GPI files. (Previously, due to lack of qualified identifiers and selective import, this would have created namespace conflicts, but now that these features are available, it can be done.) The compiler would then call the RTS based on (still "magic") Pascal-level declarations, so when a version of the RTS omits them, the compiler will simply notice the absence of those declarations in the RTS GPI files and could emit somewhat clearer errors.
There would be some strange effects, though. E.g., comparing two strings requires an RTS call, but comparing one string against '' does not because it can be optimized to a comparison of its length against 0. Removing the respective RTS routine would mean that the latter would still work, but the former wouldn't. Of course, one could explicitly forbid the latter as well when the RTS routine isn't found (I'm not sure whether that would be useful or just annoying).
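To make the two cases concrete, here is a minimal sketch in GPC Pascal. The program and variable names are made up for illustration, and the comments only restate what is described above; which RTS routine the compiler actually calls is not specified here. Note that WriteLn itself also relies on the RTS, so this only illustrates the two kinds of comparison, not an RTS-free program.

```pascal
program StringCompareSketch;

var
  S, T: String (20);

begin
  S := 'abc';
  T := 'abd';
  { Comparing two string variables: as described above, this is
    translated into a call of an RTS string comparison routine. }
  if S = T then
    WriteLn ('equal')
  else
    WriteLn ('different');
  { Comparing against '': as described above, this can be optimized
    to a check of the length against 0, so it would keep working even
    if the RTS comparison routine were missing. }
  if S = '' then
    WriteLn ('empty')
  else
    WriteLn ('not empty')
end.
```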
# enable support for pascal strings
STRINGS=y
# Pascal file I/O
FILES=y
The problem is that parts of them depend on each other. E.g., most file routines, some string routines, and many others can generate runtime errors. Runtime error handling uses strings ... and files ... etc. ... So omitting either strings or files would be difficult, and removing both of them would mean replacing the runtime error management with a version that doesn't use (Pascal) strings and files. So you quickly arrive at a very bare-bones RTS (which some people do use for special cases, but which e.g. the CGI unit could not easily work with -- it obviously uses strings quite a lot, as well as files, for POST uploads, output buffering, runtime error mailing, etc.).
One idea I have in mind WRT the RTS is to reorganize its units, so we'd have a (clearly visible) core of routines that are interrelated and provide the basic support, and put this in the lowest-level RTS unit. This would include runtime errors, and the necessary amount of string and file capabilities to support them, whereas e.g. additional string and file features not strictly needed here would be one level higher.
In particular, this should get rid of cyclic dependencies in the RTS. Currently there are a few explicit ones, but many more implicit ones, via magic compiler calls. E.g., an RTS routine does a file operation, and the compiler translates it to a call of an RTS file routine that it just assumes exists, although it's in a unit that will be compiled later and probably uses the current unit. By implementing my above plan (Pascal-level imports), such dependencies would become visible, in this example forcing the RTS unit to import the respective RTS file routines, and thus (at first) create a lot of cyclic dependencies in the RTS. By resolving them manually (by reorganizing the RTS units), we'd get close to the unit structure I described.
In such a setting, one could ideally omit whole RTS units that are not needed. (But, of course, declaration-level smart-linking would still give somewhat better results, so I still have it on my list ...)
- What things can be done in GPC without libgpc - for example, if one produces an include file of libc exports and doesn't use units or Pascal strings or objects or file I/O at all?
Basically yes, though there isn't an "official" list of which internals require RTS calls (and this might change slightly over time), so one can only try, looking at the linker errors (undefined references).
There are a few routines that always must be provided as they're called by automatic initialization etc. This is the list I used in a small standalone project last year. (You can omit the range check error stuff if you disable range checking, of course; OTOH you might need other runtime error routines when linker errors tell you so.) The RTS version in the linker name (here, 20050331) has to match the version of the RTS being replaced. The list of declarations and their parameters may change slightly with new GPC versions, so the code isn't exactly maintenance-free, which is the main reason the RTS version check is required.
{ The version in this linker name must match the RTS being replaced. }
var VersionCheck: Integer; attribute (name = '_p_GPC_RTS_VERSION_20050331');

{ Called by automatic initialization and finalization; empty stubs here. }
procedure Initialize (ArgumentCount: CInteger; Arguments, StartEnvironment: PCStrings;
  Options: CInteger); attribute (name = '_p_initialize'); begin end;
procedure DoInitProc; attribute (name = '_p_DoInitProc'); begin end;
procedure Finalize; attribute (name = '_p_finalize'); begin end;

{ On a range check error, terminate via libc's exit with status 42. }
procedure CExit (Status: CInteger); external name 'exit';
procedure RangeCheckError; attribute (name = '_p_RangeCheckError'); begin CExit (42) end;
- This follows from #2 - how can one write a different libgpc? Is there a special thing that has to be done to make it work (i.e., how is it different from any bog-standard .a or .so file?).
I hope the above answers this. For the most part, it isn't very special, except for the explicit linker names. (But when we change it as described, any RTS replacement also needs changing, of course, e.g. using magic Pascal names then. Also, the parameters of some RTS routines change occasionally, in accordance with compiler calling changes.)
As you can see, the C part of the RTS is quite small now (one file, rts.c, plus interfaces in rtsc.pas), and not fundamentally different from C code and interfaces called by other Pascal units (except that it uses more C headers and has many more portability conditionals, mostly based on autoconf settings, than is typical, due to its purpose of interfacing to different libc's).
The library building part (.a or .so) is nothing special in the RTS. You could link the list of .o files instead (manually) if you wanted.
gpc.pas is a bit "magical" in that it's generated by a script from the interfaces of the other units, excluding parts enclosed in "{@internal}" .. "{@endinternal}" comments. (These are just the parts with the magic linker names, more precisely those which are meant to be called only by the compiler, not from user code directly via gpc.pas. With my suggestion above, this would change, and the "{@internal}" comments would probably disappear. gpc.pas could then probably switch to proper re-exporting instead of being script-generated.)
Another special thing is the units' name-attributes which are just there to avoid namespace conflicts with user-units of the same name (which are perfectly valid, of course, so they must not break).
You have to be careful of unintended recursion due to internal compiler calls. E.g., doing file I/O from a routine that implements file I/O is a recursion, though it doesn't look like one at first sight -- it might be OK if it's to a different file, and your routines (with the corresponding data structures) are reentrant, but in general you have to be careful there.
Also, during the initialization and finalization of the RTS, RTS services may not be available as expected, so you have to be very careful about the order of doing things. E.g., before the memory manager is initialized, dynamic memory allocation obviously won't work; this includes all uses of New and GetMem, and also RTS routines that use them (which are under your control then, of course; e.g., some file routines in the current RTS).
Initialization is started via _p_initialize (see above) which has to call the RTS units' initializers (as needed). In the default RTS that's the strange "InitInit" call in init.pas which calls the implicit initializer of init.pas, which in turn calls all the other initializers automatically (in the regular way) as init.pas uses all the other units.
- Would there be any mileage in producing a libc standard unit?
libc is a rather vague term here. Such a unit could be anything from a non-portable interface of the 6 most important libc calls (open, close, read, write, fork, exec, according to Linus ;-) to a fully portable interface to all known libc's on this planet, with interface to all functions supported by any of them plus emulations/errors where not supported ...
The RTS's rts.c file (plus rtsc.pas interfaces) is somewhat closer to the latter extreme (though there are still many areas of libc not covered yet). It actually makes available many functions in "C style" (e.g. OpenHandle etc., visible in gpc.pas), besides being used in the RTS to implement the higher-level routines. So to some extent this is such a unit already.
I suppose you're thinking more of a rather minimal unit. A problem is that different programmers (and even different projects by the same programmer) will often disagree on just how minimal it should be. In the end, a bigger unit may fare better, provided the unused parts are removed automatically. Yes, I know, we need smart linking ...
Frank