Jesper Lund wrote:
I have encountered a couple of problems with strings in the most recent GPC beta version. Most of the problems are due to missing range checks, e.g. when assigning a string to a shorter string, or passing a string to a procedure with a shorter string as formal parameter.
- The program below illustrates some of these problems:
Yes, this is a known bug. I think it will be solved soon.
- Another facet of the problem occurs when reading strings from a text file with Read[ln] (F, stringVar).
If you call Readln, the program will read from the file until the next CR/LF, regardless of the capacity of stringVar, and try to copy the entire line to the variable stringVar. This will almost surely crash the program somehow, but not necessarily in the call to Readln: the SIGSEGV exception could come later, with a stack trace which makes no sense.
Strange! There has always been a check for the maximum length in the read[ln] procedure.
On the other hand, Read (F, stringVar) only reads a maximum of Length(stringVar) characters [as it should].
Even more strange, since Read and Readln are actually the same procedure (or rather function) internally. Readln (F, stringVar) first does exactly the same as Read (F, stringVar), and then a Readln (F).
Perhaps something was broken temporarily while I changed other things in the Read/Write procedures. OTOH, the following program works correctly (i.e. outputs only 5 characters, even if I type more than 5) on my Linux GPC (which has been modified somewhat after 971001) as well as on my DJGPP GPC (which is the original 971001 binary distribution):
program x; var s:string(5); begin readln(s); writeln(s) end.
Perhaps you did some other thing in your program that triggered the first bug? If not, please send me a program and sample input to show this problem.
Frank Heckenbach wrote:
Strange! There has always been a check for the maximum length in the read[ln] procedure.
On the other hand, Read (F, stringVar) only reads a maximum of Length(stringVar) characters [as it should].
Even more strange, since Read and Readln are actually the same procedure (or rather function) internally. Readln (F, stringVar) first does exactly the same as Read (F, stringVar), and then a Readln (F).
Perhaps you did some other thing in your program that triggered the first bug? If not, please send me a program and sample input to show this problem.
MY MISTAKE! Read[ln] works correctly and never exceeds the string capacity, and my problems are totally UNrelated to the missing range checks in some string expressions that I mentioned in the original post.
The strange crash in my program was indeed caused by writing to memory beyond the string capacity (that is, overwriting something else), but I was jumping to conclusions in blaming Readln. The real culprit was my own buggy code (specifically, a procedure that added a #0 to the string ...).
ADVICE TO GPC USERS:
If you add #0 to a string (in order to call functions in the C library), it is a *very* good idea to use strings whose capacity (max length) is not divisible by 4. In that case, at least one byte is "wasted" to achieve 4-byte alignment, and adding #0 should not overwrite other variables. Alternatively, you should explicitly check that the entire capacity is not used before adding the #0 char, like the function below, which safely adds #0 and returns a Cstring (pointer) to the string.
Regards,
Jesper Lund
function Get_Cstring (var s : string) : Cstring;
var
  p : Cstring;
  sLen : Integer;
begin
  sLen := Length (s);
  if sLen = s.Capacity then  { We are about to crash .... }
    begin
      Writeln ('Fatal error in Get_Cstring: string too short');
      Halt;
    end; { if }
  p := Cstring (@s[1]);  { Generates same code as: p := s }
  p[sLen] := #0;         { Note: This overwrites s[sLen+1] with #0 }
  Get_Cstring := p;      { Return pointer to first element }
end; { Get_Cstring }
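
For illustration only, here is a minimal sketch of the same capacity test applied on the caller's side, before any #0 is appended. The program name and the variables a and b are hypothetical; it assumes only the GPC features already used above (schema strings declared as string (n), the Capacity field, and Length).

```pascal
program CapacityCheck;  { hypothetical name, for illustration only }

var
  a : string (7);  { capacity not divisible by 4, with room to spare }
  b : string (5);  { capacity exactly equal to the length assigned below }

begin
  a := 'hello';
  b := 'hello';

  Writeln ('a: length ', Length (a), ', capacity ', a.Capacity);
  Writeln ('b: length ', Length (b), ', capacity ', b.Capacity);

  { Same test as in Get_Cstring: a trailing #0 only fits inside the
    string if at least one character of capacity is still unused. }
  if Length (a) < a.Capacity then
    Writeln ('a has room for a trailing #0');
  if Length (b) = b.Capacity then
    Writeln ('b is full; a trailing #0 would land beyond its capacity')
end.
```

With these declarations, passing b to Get_Cstring would be expected to take the error branch and halt, while a would leave two characters of capacity unused.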