Tom Verhoeff wrote:
Frank wrote:
As you probably know, GPC currently converts all identifiers to first letter upper-case/rest lower-case. Since this is often not nice, I plan to change this. This would affect at least the following things:
It is not clear to me what change you propose.
Only what I wrote, i.e. (optional) warnings, error messages, file names (in certain situations).
You probably mean (because this is easiest to implement) to (optionally?) drop the conversion altogether and consequently have case-sensitive identifiers.
Of course not.
This also seems easy to implement (at the cost of a slightly more expensive comparison for lookup) and more in line with Pascal:
When creating a symbol-table entry you store the casing as found. When looking up a name you compare names modulo case (e.g. convert both comparands to lower case), when you get a hit you use the actual casing found in the symbol table.
For the technical details: I'll probably store both a canonical casing (for lookups) and the given casing (for messages etc.). Since lookup is not done as a series of compares, but using a hash etc., it's important to have a canonical form there. (Part of this lookup is done in the backend, and would be hard to change for us, and it's also more efficient this way.)
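The two-spelling idea above can be sketched roughly as follows. This is only an illustration in Python, not GPC's implementation; the class and method names are hypothetical, and lower-case is assumed as the canonical form.

```python
class SymbolTable:
    """Case-insensitive lookup that remembers the declared spelling."""

    def __init__(self):
        # canonical (lower-case) name -> spelling used in the declaration
        self._entries = {}

    def declare(self, name):
        key = name.lower()  # canonical form, used as the hash key
        if key in self._entries:
            raise KeyError(f"duplicate identifier: {self._entries[key]}")
        self._entries[key] = name  # given casing, kept for messages etc.

    def lookup(self, name):
        # Returns the declared spelling, or None if the name is undeclared.
        return self._entries.get(name.lower())


table = SymbolTable()
table.declare("myFoo")
declared = table.lookup("MYFOO")  # found regardless of the case used here
```

Hashing on the canonical form keeps lookup a single hash probe, while messages can still show the spelling from the declaration.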
Markus Gerwinski wrote:
- Any ideas for the name of the option? (`-Widentifier-case'?)
Sounds good to me.
(If "on" is the default, the option should IMO get another name. E.g. `-Wignore-identifier-case'.)
`-Wno-identifier-case' then -- that would be quite sure, since most options come in `[no-]' pairs.
- Should it work across units/modules? Or should it be a tri-state option (never/current file/global)?
Tri-state, I suppose. If you yourself care about identifier case, but use units by someone who doesn't, it would be good to disable the warnings for those units.
Name for the 3rd option? `-Widentifier-case-local'?
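One possible reading of the tri-state option, sketched in Python. The enum, the function name and the exact meaning of "current file" (warn only when declaration and use are in the same file) are assumptions made for illustration; nothing here is decided in this thread.

```python
from enum import Enum


class CaseWarnScope(Enum):
    NEVER = "never"
    LOCAL = "current file"
    GLOBAL = "global"


def should_warn(scope, declared, used, decl_file, use_file):
    """Decide whether a use spelled differently from its declaration warns."""
    if declared == used:
        return False  # spelling matches the declaration, nothing to report
    if scope is CaseWarnScope.NEVER:
        return False
    if scope is CaseWarnScope.LOCAL:
        # Assumed meaning: only complain when both are in the same file.
        return decl_file == use_file
    return True  # GLOBAL: also check uses of identifiers from other units


same_file = should_warn(CaseWarnScope.LOCAL, "myFoo", "MyFoo", "a.pas", "a.pas")
cross_unit = should_warn(CaseWarnScope.LOCAL, "myFoo", "MyFoo", "u.pas", "a.pas")
```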
One question: What convention will GPC use instead of the old one? Will the asmnames be verbatim as defined? (E.g. if I write "type myFoo", the asmname of that type will be 'myFoo'?)
I intentionally didn't mention asmnames, because I'm not going to change them now. They're related to qualified identifiers, routine overloading, etc., which is a bigger mess of changes (for which I don't have the time now).
Most of the following comments are related to them, so they're not current now, but I'll comment, anyway.
Pierre Muller wrote:
At 10:23 09/01/2003, Frank Heckenbach wrote:
As you probably know, GPC currently converts all identifiers to first letter upper-case/rest lower-case. Since this is often not nice, I plan to change this. This would affect at least the following things:
What is your new rule?
(A) All lowercase?
(B) Using the case in the declaration? Here you will face a problem that we (Free Pascal developers) already faced with our `cdecl' modifier. For a forward-declared function, which case should be used:
- the case used in the first declaration (with forward), or
- the case in the second, true declaration of the function?
(C) Other?
My ideas are the following:
From the Pascal programmer's viewpoint, the default asmname is "anything" (i.e., don't rely on it, and use an explicit `asmname' when you need it in C or so).
A simple `external' directive will convert the Pascal identifier to lower-case. This may be useful for some C functions, but generally I'd expect `external' to be used together with `asmname'. This is already possible, so I suggest writing such code now, so that it can remain unchanged later. (Additionally, the BP or Delphi syntax `external name 'foo'' can be supported then.)
The directives `c' and `c_language' are then obsolete and can be dropped.
- Debugging: first character upper-case and the rest lower-case is a default combination that I added in the p-exp.y file of the GDB sources to make GDB behave more case-insensitively. (Remember, I am the official Pascal language maintainer for GDB.)
Indeed, this may be a problem. BTW, what do you do in FPC with overloaded routines, same identifiers in different units etc.? Can they have their Pascal names encoded for gdb, or do you have to do "demangling" in gdb? If the latter is the case, we'll have to define some mangling rules in the future (maybe the same that FPC uses) -- but even then, I'd recommend Pascal programmers not to rely on them, and to use them only for gdb.
- Interaction with C sources: most standard C sources use all-lower-case names, so if you choose the same behavior, C and GPC identifiers will overlap. This seems to be a very dangerous change, because some GPC-specific functions might then resolve to the standard C function of the same name.
I'm aware of this. AFAIK, this was the reason why the first-uppercase rule was introduced (long time ago). For now, this will remain (for asmnames), and in the future the mangling will have to be defined such as to avoid such conflicts.
Another point is that, in Free Pascal, we use upper-case names for normal Pascal variables and functions and lower-case names for internal functions (like for operator overloading), so we would get into trouble if we made the same change. But I don't know the GPC internals and cannot tell whether this problem could also appear inside GPC.
BTW, currently overloaded operators get asmnames like `plus_Integer_Integer' (so they don't conflict with lower-case C functions or Pascal routines), but this can also be changed in the future together with asmnames of routines and variables ...
Wood David wrote:
However, I don't really support compiler changes which force legacy code to be changed, regardless of whether there are new warnings or not. I can live with the change and update the code, but it is an irritation to our small community of code developers.
I understand this. However, the asmname and related issues are one major incompatible change that must happen sometime. I really mean must -- different units/modules can have the same identifiers, which with the current rule get the same asmname and conflict at link time. Since both EP (modules) and BP (units) allow this, we'll have to change the convention. But again, not now ...
Given that most Pascal built-ins such as ReadLn and WriteLn (or writeln or WRITELN!) are redefinable identifiers, are all these variations going to be flagged as warnings? I see a long road ahead to get back to squeaky-clean code!
I think I can arrange for predefined identifiers (and keywords) not to have an "enforced" spelling (though I'd like to ;-). I.e., it would be ok to always write `WriteLn', or always write `WRITELN' etc., but not mixed.
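A rough Python sketch of that "any spelling, but consistent" rule for predefined identifiers. It assumes the first spelling seen becomes the reference for later uses; the function name and the message text are made up for illustration.

```python
def check_predefined(uses):
    """Return warnings for predefined identifiers spelled inconsistently."""
    first_spelling = {}  # lower-case name -> first spelling seen
    warnings = []
    for spelling in uses:
        key = spelling.lower()
        # The first use fixes the expected spelling; later uses must match it.
        expected = first_spelling.setdefault(key, spelling)
        if spelling != expected:
            warnings.append(f"`{spelling}' was spelled `{expected}' earlier")
    return warnings


consistent = check_predefined(["WRITELN", "WRITELN", "ReadLn"])
mixed = check_predefined(["WriteLn", "writeln", "WriteLn"])
```

With these inputs, `consistent' is empty and `mixed' contains one warning, for the `writeln' use.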
CBFalconer wrote:
I actually think that your existing convention (1st upper, rest lower) is an excellent way to resolve most things. I am quite sure I don't have a grasp of the conflicting systematic requirements involved. It might be helpful to enumerate them.
As for asmnames, I'm not quite sure yet myself. In particular, I don't know which characters can be used in assembler identifiers on all systems (which may be needed for the "mangling"). I think I tried `$' for a special identifier and it failed somewhere. I don't know if `.' or any other character will work everywhere. Otherwise (if only alphanumerics and `_' work), the mangling might get even more tricky ...
Frank