The African Chief wrote:
But how would you translate references to s[0] in BP then? I know, there shouldn't be any in most programs, but sometimes it can be useful, e.g. for efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
"s[0] := ..." is okay.
But you couldn't do that with long strings since their length isn't stored in [0], but in .length. So there are cases where write access to .length is needed, that's what I meant.
I do that myself sometimes (but only to set the length to zero).
s:='' should do the same. (BP optimizes this to just setting the length to 0; I hope gpc does as well.)
"length(s) := ...."
or "s.length := ..."
- which shouldn't compile (if they are supported at all) when using "--borland-pascal".
I think it should (at least with {$X+} or such), because it's the only way to translate some kinds of procedures -- unless one wants to limit such procedures to borland-like short strings.
Does this mean strings will always have an additional trailing #0? Do the string routines currently support this (i.e. automatically add a #0 at the end; but they should not recognize #0 as a terminator, but only rely on the length field)? That would be good, because it simplifies converting a string into a CString a lot (just "CString(@s[1])", right?).
Delphi 2 does this with long strings (I think). If we are going to do it, perhaps it should only be done for long strings as well?
However, what happens when you want to add a string to the end of it? Check this example, and look at the output under Borland:
program Fred;
var
  s, s1: string[40];
begin
  s := 'Fred Smith' + #0;
  s1 := s + 'is okay';
  writeln(s1);
  { prints "Fred Smith is okay" - where did the space before "okay" come
    from? I didn't want or put it there - viz. the problem with a trailing "#0" }
end.
As Peter has already pointed out, an implicit #0 would not behave like this. I suppose this would be achieved by having the length not include the #0. So the internal representation of 'foo' would be something like:
Long string:  Capacity:...; Length: 3; string: ('f','o','o',#0,...)
Short string: (#3,'f','o','o',#0,...)
The question is, should the #0 be added for short strings as well? If Length=Capacity, it's impossible, but otherwise???
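To make the two layouts and the Length=Capacity case concrete, here is a rough sketch in C (chosen only because it shows the bytes directly). All type and function names, the fixed capacities, and the spare byte reserved in the long string are assumptions made for illustration; this is not how GPC or BP actually lay out their strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical short string: chars[0] is the length byte,
   chars[1..SHORT_CAPACITY] are the characters. */
#define SHORT_CAPACITY 8

typedef struct {
    unsigned char chars[SHORT_CAPACITY + 1];
} ShortStr;

/* Hypothetical long string: separate capacity and length fields, plus
   one spare byte (an assumption) so the implicit #0 always fits. */
#define LONG_CAPACITY 16

typedef struct {
    size_t capacity;
    size_t length;
    char   chars[LONG_CAPACITY + 1];
} LongStr;

/* Assign text to a short string, truncating to the capacity.
   Returns 1 if an implicit #0 could be stored after the characters,
   0 if Length = Capacity and there is no room for it. */
int short_assign(ShortStr *s, const char *text)
{
    size_t n = strlen(text);
    if (n > SHORT_CAPACITY)
        n = SHORT_CAPACITY;
    s->chars[0] = (unsigned char)n;    /* the "s[0]" of BP */
    memcpy(&s->chars[1], text, n);
    if (n < SHORT_CAPACITY) {
        s->chars[1 + n] = 0;           /* implicit #0, not counted in the length */
        return 1;
    }
    return 0;
}

/* Assign text to a long string; the #0 always fits in the spare byte. */
void long_assign(LongStr *s, const char *text)
{
    size_t n = strlen(text);
    if (n > LONG_CAPACITY)
        n = LONG_CAPACITY;
    s->capacity = LONG_CAPACITY;
    s->length = n;
    memcpy(s->chars, text, n);
    s->chars[n] = 0;                   /* implicit #0, not counted in the length */
}
```

With this model, short_assign(&s, "foo") stores (#3,'f','o','o',#0) and returns 1, while an 8-character text fills the string completely and returns 0. That is exactly the case where a short string could not carry the #0 unless an extra byte were reserved for it.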
BTW: Your example raises the "problem" that if you have a string with one or more #0 in it, and convert it to a CString, the CString will be shorter than expected. But I think there's nothing to be done about it. If one wants to convert a string to a CString, one just must not put any #0 in it...
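A small C sketch of that effect, using a made-up length-counted type (CountedStr and both function names are hypothetical, not part of any compiler's runtime): the stored length counts every character, but anything that treats the buffer as a CString stops at the first #0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-counted string, for illustration only. */
#define COUNTED_CAPACITY 40

typedef struct {
    size_t length;
    char   chars[COUNTED_CAPACITY + 1];  /* room for an implicit #0 */
} CountedStr;

/* Build a counted string from n raw bytes, which may contain #0. */
void counted_from_bytes(CountedStr *s, const char *bytes, size_t n)
{
    if (n > COUNTED_CAPACITY)
        n = COUNTED_CAPACITY;
    s->length = n;
    memcpy(s->chars, bytes, n);
    s->chars[n] = 0;                     /* implicit terminator after the data */
}

/* Length as a C routine sees it: scanning stops at the first #0. */
size_t length_as_cstring(const CountedStr *s)
{
    return strlen(s->chars);
}
```

Filling such a string with the 18 bytes of 'Fred Smith' + #0 + 'is okay' leaves length at 18, while length_as_cstring reports 10, so a C routine would only ever see "Fred Smith".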
Peter Gerwinski wrote:
But how would you translate references to s[0] in BP then? I know, there shouldn't be any in most programs, but sometimes it can be useful, e.g. for efficiency. But it's ok for me if it's only enabled with {$X+}, see below.
That's one reason why `ShortString's should be implemented - which will have "s[0]". But this job doesn't have high priority for me ...
As I said above, it would (currently) suffice for me if "s[0]:=..." could be translated to "s.CurrentLength:=..." for long strings. But at the moment, it can't be translated at all. The BP syntax "s[0]" is, IMHO, not very good, so it should not be supported (emulated) for long strings.
Sounds good! {$X} is a local switch, isn't it (so one can turn it on and off around code that needs it, and be safe from accidents elsewhere). It would be even better than BP (where accidental writes to s[0] are not even detected by range checking). :-)
As long as GPC has no range checking at all ...
I know...
Currently the length could probably be changed by writing to s[1-SizeOf(Integer)] .. s[0] -- but I really don't want to try that... :-|
BTW: Does the standard require range checking (or does it say anything about it at all)?
"CurrentLength"?
Why not.
What about this: With (*$X+*), addresses of packed array members are only rejected if they don't lie on a byte boundary, and otherwise they work?
And with {$X-} they don't work at all, you mean? Seems ok. And the same for packed records?
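The byte-boundary test itself is simple arithmetic. A minimal sketch in C, assuming the packed array starts on a byte boundary and its elements are packed with no gaps (the function name is hypothetical, and a real compiler might apply further restrictions):

```c
#include <stddef.h>

/* Returns 1 if element `index` of a packed array whose elements take
   `bits_per_element` bits each starts on a byte boundary, else 0.
   Assumes the array starts on a byte boundary and has no padding. */
int starts_on_byte_boundary(size_t index, size_t bits_per_element)
{
    return (index * bits_per_element) % 8 == 0;
}
```

For 3-bit elements, index 0 and index 8 start on a byte boundary but index 1 does not, so under the proposal only the address of the former could be taken with (*$X+*).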
Another idea, perhaps even better: let "String ( 255 )" or "LongString ( 255 )" denote an Extended Pascal (long) string, but "String [ 255 ]" or "ShortString [ 255 ]" a UCSD Pascal (short) string? Like this, both Extended Pascal and Borland Pascal programs will just compile. Needless to say that `--extended-pascal' will switch off "String [ 255 ]" and `--borland-pascal' will switch off "String ( 255 )" ...
Isn't it good when non-standard compilers (BP) use non-standard syntax? :-)
BTW: String[n] with n>255 wouldn't work at all then? Seems ok to me.