Emil Jerabek wrote:
Actually, if I read the standard correctly, the `Char' type doesn't need to contain anything but the digits.
And '''', per 6.1.9.
Yes -- but it doesn't have to stand for the apostrophe. So it would be allowed, e.g., to let '''' mean the space, while ' ' is not valid (i.e., the space is not a string-character), to maintain the required one-to-one correspondence.
Moreover, IIUIC, 6.10.* implicitly require that Char contains ' ', '-', '+', '.', and the letters 'a', 'e', 'f', 'l', 'r', 's', 't', 'u' in an implementation-defined case.
Indeed. Which leaves the question of whether 6.4.2.2 d) 2)/3) apply if only some letters exist. I wouldn't think so. So I propose the following character encoding for the "Really Stupid Pascal Compiler":
    Ord     Char
    0       -
    1       f
    2       A
    3       L
    4       s
    5       E
    6       u
    7       r
    8       T
    9       0
    10      1
    11      2
    ...
    18      9
    19      (space)
    20      .
    21      +
This encoding has the obvious advantage that `fALsE' and `TruE' are represented by the characters 1 to 5 and 8 downto 5 respectively. So instead of storing them as string constants, a compiler could construct them using `for' loops internally which should add a lot in the areas of inefficiency and space overhead.
The space follows the last digit, so that (somewhat common) overrun errors when outputting digits manually are less likely to result in visible faults.
The fact that '-' < '0' < '+' is, of course, mathematically a big improvement over ASCII.
Furthermore, the digit '2' is represented by 11 which might be useful for Roman numeral applications.
For character-strings I suggest (see above) '''' to mean the space and ' ' to mean '.' and nothing else. Not allowing too many characters here will ease the work for the compiler, whereas the programmer can use `Chr', anyway.
Anyway, I seriously doubt that 6.4.2.2 d) was intended to mean "no letters at all is fine"; it might be a bug in the standard.
Perhaps they mean that only upper or lower case letters are ok, but they don't seem to say so.
Frank