Hi, all
I'm having trouble with list management again.
The code snipper below is part of a much longer program. It crashes apparently at random, sometimes after looping several thousand times. A simplified version ran "forever", so I know the basic strategy is fine. The crash occurs upon "dispose"ing a record in list Root, and gives sometimes a Page Fault, sometimes a General Protection Fault. Examining the record being disposed shows no pattern, such as being at the end of the list, near the edge of the lattice, etc.
The object of the procedure is to define pathways through the 3-D maze defined in array Latt. Latt codes for connections in the positive and negative x, y, and z directions, and whether a given point has been tested yet (via Occ). In each pass through the list Root, the record location is examined for connected neighbors. The record is then disposed, and connected neighbors are inserted into the list "LNext". Then LNext is given to Root for the next pass. All insertions and deletions are therefore at the first position in a list.
It's probably an obvious problem, and I'm hoping fresh eyes can see it.
thanks, Toby
---------------------------------
Procedure Advance(var Root : PRec; {rx, ry, rz: medcard; next: PRec} var Latt : PPLary; (3D lattice in schema format} var here, count : medCard); {Local to NumLatt. 1. Identify adjacent, accessible sites. 2. if sites are unoccupied, insert them into LNext. 3. Delete Self from Root. 4. Set Root = Lnext.} var LNext, PSelf : PRec; lx, ly, lz, {neighboring locations} dir : medCard; {directions} Conduct : boolean; {=accessible site}
begin LNext := Nil; while (Root <> Nil) do begin PSelf := Root; with PSelf^ do for dir := 1 to 6 do begin {check all 6 directions} lX := rX; lY := rY; lZ := rZ; case dir of 1 : if (rX > 1) then begin {-X direction} lX := rX-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xn) = Xn); end else Conduct := False; 2 : begin {-Y direction} if (rY = 1) then lY := ny else lY := rY-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yn) = Yn); end; 3 : begin {-Z direction} if (rZ = 1) then lZ := nz else lZ := rZ-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zn) = Zn); end; 4 : begin {+Z direction} if (rZ = nz) then lZ := 1 else lZ := rZ+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zp) = Zp); end; 5 : begin {+Y direction} if (rY = ny) then lY := 1 else lY := rY+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yp) = Yp); end; 6 : if (rX >= nx) then Conduct := False else begin{+X direction} lX := rX+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xp) = Xp); end; end; {case}
if (Conduct and (Latt^[lX]^[lY,lZ] < Occ)) then begin {unoccupied site} inc(count); {count newly accessible} LInsert(LNext, lX, lY, lZ); {insert into LNext} Latt^[lX]^[lY,lZ] := Latt^[lX]^[lY,lZ] OR Occ;{mark as occupied} if ((not Infinite) and (lX = nx) {check for breakthrough} and ((Latt^[lX]^[lY,lZ] AND Xp) = Xp)) then begin Infinite := True; MinPath := here; end; end; {if Conduct} end; {for dir} Root := Root^.next; {advance through list} write('a'); dispose(PSelf); { <<=== program crashes here } write('b. '); end; Root := LNext; end; {Advance}
Toby Ewing wrote:
I'm having trouble with list management again.
The code snipper below is part of a much longer program. It crashes apparently at random, sometimes after looping several thousand times. A simplified version ran "forever", so I know the basic strategy is fine. The crash occurs upon "dispose"ing a record in list Root, and gives sometimes a Page Fault, sometimes a General Protection Fault. Examining the record being disposed shows no pattern, such as being at the end of the list, near the edge of the lattice, etc.
... snip ...
It's probably an obvious problem, and I'm hoping fresh eyes can see it.
All I have done below is reformat, so I could follow it, and separate out the mundane operations into a local procedure, so that the actual list manipulation stands alone, and also because I have no idea what that piece of code is doing. Many variables should be moved into the local, and the item to manipulate should be passed into it, but that is another matter. Now go to the end ....
Procedure Advance(VAR Root : PRec; {rx, ry, rz: medcard; next: PRec} VAR Latt : PPLary; (3D lattice in schema format} VAR here, count : medCard); {Local to NumLatt.
- IdentIFy adjacent, accessible sites.
- if sites are unoccupied, insert them into LNext.
- Delete Self from Root.
- Set Root = Lnext.}
VAR LNext, PSelf : PRec; lx, ly, lz, {neighboring locations} dir : medCard; {directions} Conduct : boolean; {=accessible site}
(* 2--------------------2 *)
PROCEDURE operate;
BEGIN (* operate *) WITH PSelf^ DO FOR dir := 1 to 6 DO BEGIN {check all 6 directions} lX := rX; lY := rY; lZ := rZ; CASE dir of 1 : IF (rX > 1) THEN BEGIN {-X direction} lX := rX-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xn) = Xn); END ELSE Conduct := False; 2 : BEGIN {-Y direction} IF (rY = 1) THEN lY := ny ELSE lY := rY-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yn) = Yn); END; 3 : BEGIN {-Z direction} IF (rZ = 1) THEN lZ := nz ELSE lZ := rZ-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zn) = Zn); END; 4 : BEGIN {+Z direction} IF (rZ = nz) THEN lZ := 1 ELSE lZ := rZ+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zp) = Zp); END; 5 : BEGIN {+Y direction} IF (rY = ny) THEN lY := 1 ELSE lY := rY+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yp) = Yp); END; 6 : IF (rX >= nx) THEN Conduct := False ELSE BEGIN {+X direction} lX := rX+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xp) = Xp); END; END; {CASE} IF (Conduct AND (Latt^[lX]^[lY,lZ] < Occ)) THEN BEGIN {unoccupied site} {count newly accessible} inc(count); {insert into LNext} LInsert(LNext, lX, lY, lZ); {mark as occupied} Latt^[lX]^[lY,lZ] := Latt^[lX]^[lY,lZ] OR Occ; {check FOR breakthrough} IF ( (not Infinite) AND (lX = nx) AND ((Latt^[lX]^[lY,lZ] AND Xp) = Xp) ) THEN BEGIN Infinite := True; MinPath := here; END; END; {IF Conduct} END; {FOR dir} (* with *) END; (* operate *)
(* 2--------------------2 *)
BEGIN (* advance *) LNext := Nil; WHILE (Root <> Nil) DO BEGIN PSelf := Root; operate; Root := Root^.next; {advance through list} write('a'); dispose(PSelf); { <<=== program crashes here } write('b. '); END; (* while *) Root := LNext; END; {Advance}
The above makes it pretty obvious that the problem is NOT in the operate action, but in the list itself. The fact that dispose crashes shows that PSelf is not a valid pointer to a record, which includes *the fact that the record was created with new*
I think you should look into the areas where the list is created. You may have a global instance of an element, for example, that you are incorporating into the list somehow.
GPC new/dispose is dependant on the C malloc, I believe. If you are using DJGPP you can probably use my nmalloc, available on my site, which provides some hooks for examining the allocation chains and also detects most illegal frees. If it is linked ahead of the runtime library it will replace the library malloc stuff. But that should be only a last resort.
Hi, all
CBFalconer wrote:
Toby Ewing wrote:
I'm having trouble with list management again.
... snip ...
It's probably an obvious problem, and I'm hoping fresh eyes can see it.
All I have done below is reformat, so I could follow it, and separate out the mundane operations into a local procedure, so that the actual list manipulation stands alone, and also because I have no idea what that piece of code is doing. Many variables should be moved into the local, and the item to manipulate should be passed into it, but that is another matter.
It's a trade-off between clarity and speed. The loop will execute several billions of times, and the run time is several days. I try not to call functions or procedures from inner loops. But as you say, that's a whole other discussion.
Now go to the end
Procedure Advance(VAR Root : PRec; {rx, ry, rz: medcard; next: PRec} VAR Latt : PPLary; (3D lattice in schema format} VAR here, count : medCard); {Local to NumLatt.
- Identify adjacent, accessible sites.
- if sites are unoccupied, insert them into LNext.
- Delete Self from Root.
- Set Root = Lnext.}
VAR LNext, PSelf : PRec; lx, ly, lz, {neighboring locations} dir : medCard; {directions} Conduct : boolean; {=accessible site}
(* 2--------------------2 *)
PROCEDURE operate;
BEGIN (* operate *) WITH PSelf^ DO FOR dir := 1 to 6 DO BEGIN {check all 6 directions} lX := rX; lY := rY; lZ := rZ; CASE dir of 1 : IF (rX > 1) THEN BEGIN {-X direction} lX := rX-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xn) = Xn); END ELSE Conduct := False; 2 : BEGIN {-Y direction} IF (rY = 1) THEN lY := ny ELSE lY := rY-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yn) = Yn); END; 3 : BEGIN {-Z direction} IF (rZ = 1) THEN lZ := nz ELSE lZ := rZ-1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zn) = Zn); END; 4 : BEGIN {+Z direction} IF (rZ = nz) THEN lZ := 1 ELSE lZ := rZ+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Zp) = Zp); END; 5 : BEGIN {+Y direction} IF (rY = ny) THEN lY := 1 ELSE lY := rY+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Yp) = Yp); END; 6 : IF (rX >= nx) THEN Conduct := False ELSE BEGIN {+X direction} lX := rX+1; Conduct := ((Latt^[rX]^[rY,rZ] AND Xp) = Xp); END; END; {CASE} IF (Conduct AND (Latt^[lX]^[lY,lZ] < Occ)) THEN BEGIN {unoccupied site} {count newly accessible} inc(count); {insert into LNext} LInsert(LNext, lX, lY, lZ); {mark as occupied} Latt^[lX]^[lY,lZ] := Latt^[lX]^[lY,lZ] OR Occ; {check FOR breakthrough} IF ( (not Infinite) AND (lX = nx) AND ((Latt^[lX]^[lY,lZ] AND Xp) = Xp) ) THEN BEGIN Infinite := True; MinPath := here; END; END; {IF Conduct} END; {FOR dir} (* with *) END; (* operate *)
(* 2--------------------2 *)
BEGIN (* advance *) LNext := Nil; WHILE (Root <> Nil) DO BEGIN PSelf := Root; operate; Root := Root^.next; {advance through list} write('a'); dispose(PSelf); { <<=== program crashes here } write('b. '); END; (* while *) Root := LNext; END; {Advance}
The above makes it pretty obvious that the problem is NOT in the operate action, but in the list itself. The fact that dispose crashes shows that PSelf is not a valid pointer to a record, which includes *the fact that the record was created with new*
I agree that the problem is that PSelf is apparently not a valid pointer. I don't understand what you're saying about New. Is New not stable or trustworthy?
I think you should look into the areas where the list is created. You may have a global instance of an element, for example, that you are incorporating into the list somehow.
The list is created by 1) declaring the variable Root : PRec (outside procedure Advance), and 2) using the procedure LInsert to add elements to the list. LInsert is:
Procedure LInsert(var Root : PRec; var x, y, z : medCard); var temp : PRec; begin new(temp); with temp^ do begin rx := x; ry := y; rz := z; next := Root; end; Root := temp; end;
The only list manipulation is the LInsert procedure, and the dispose line that sometimes crashes the program. All records in the list come from LInsert. What I find disturbing is that this is a very simple piece of list work. The program sometimes runs through over 2,000,000 records without crashing - and sometimes crashes after about 100 records. Frustrating.
CBFalconer continues:
GPC new/dispose is dependant on the C malloc, I believe. If you are using DJGPP you can probably use my nmalloc, available on my site, which provides some hooks for examining the allocation chains and also detects most illegal frees. If it is linked ahead of the runtime library it will replace the library malloc stuff. But that should be only a last resort.
Yes, I'm using DJGPP and will check out your nmalloc. I'm also installing GPC on a Linux box so I can try te program there. Because this is memory weirdness, I'm wondering if at heart it's a windows problem.
Toby Ewing Soil Scientist, Iowa State University
Toby Ewing wrote:
CBFalconer wrote:
Toby Ewing wrote:
I'm having trouble with list management again.
... snip ...
It's probably an obvious problem, and I'm hoping fresh eyes can see it.
All I have done below is reformat, so I could follow it, and separate out the mundane operations into a local procedure, so that the actual list manipulation stands alone, and also because I have no idea what that piece of code is doing. Many variables should be moved into the local, and the item to manipulate should be passed into it, but that is another matter.
It's a trade-off between clarity and speed. The loop will execute several billions of times, and the run time is several days. I try not to call functions or procedures from inner loops. But as you say, that's a whole other discussion.
You are doing enough in that routine that there should be no perceptible difference in speed. The first rule of optimization is don't. The second is profile first.
Now go to the end
Procedure Advance(VAR Root : PRec; {rx, ry, rz: medcard; next: PRec} VAR Latt : PPLary; (3D lattice in schema format} VAR here, count : medCard); {Local to NumLatt.
- Identify adjacent, accessible sites.
- if sites are unoccupied, insert them into LNext.
- Delete Self from Root.
- Set Root = Lnext.}
VAR LNext, PSelf : PRec; lx, ly, lz, {neighboring locations} dir : medCard; {directions} Conduct : boolean; {=accessible site}
(* 2--------------------2 *)
PROCEDURE operate;
BEGIN (* operate *)
... snip ...
END; (* operate *)
(* 2--------------------2 *)
BEGIN (* advance *) LNext := Nil; WHILE (Root <> Nil) DO BEGIN PSelf := Root; operate; Root := Root^.next; {advance through list} write('a'); dispose(PSelf); { <<=== program crashes here } write('b. '); END; (* while *) Root := LNext; END; {Advance}
The above makes it pretty obvious that the problem is NOT in the operate action, but in the list itself. The fact that dispose crashes shows that PSelf is not a valid pointer to a record, which includes *the fact that the record was created with new*
I agree that the problem is that PSelf is apparently not a valid pointer. I don't understand what you're saying about New. Is New not stable or trustworthy?
Perfectly stable and trustworthy.
I think you should look into the areas where the list is created. You may have a global instance of an element, for example, that you are incorporating into the list somehow.
The list is created by
- declaring the variable Root : PRec (outside procedure Advance), and
- using the procedure LInsert to add elements to the list.
LInsert is:
Procedure LInsert(var Root : PRec; var x, y, z : medCard); var temp : PRec; begin new(temp); with temp^ do begin rx := x; ry := y; rz := z; next := Root; end; Root := temp; end;
The only list manipulation is the LInsert procedure, and the dispose line that sometimes crashes the program. All records in the list come from LInsert. What I find disturbing is that this is a very simple piece of list work. The program sometimes runs through over 2,000,000 records without crashing - and sometimes crashes after about 100 records. Frustrating.
CBFalconer continues:
GPC new/dispose is dependant on the C malloc, I believe. If you are using DJGPP you can probably use my nmalloc, available on my site, which provides some hooks for examining the allocation chains and also detects most illegal frees. If it is linked ahead of the runtime library it will replace the library malloc stuff. But that should be only a last resort.
Yes, I'm using DJGPP and will check out your nmalloc. I'm also installing GPC on a Linux box so I can try te program there. Because this is memory weirdness, I'm wondering if at heart it's a windows problem.
I assume you have a declaration somewhere like:
TYPE prec = ^rec; rec = RECORD ... next : prec; END;
and I was envisioning you had done something like declare
VAR root : rec; ...
but you say not. Have you ever initialized root to be NIL?
VAR root : prec;
.... root := NIL; (* <---- absolutely crucial *) WHILE whatever DO linsert(root, ...); ....
If you don't have ECC memory it is conceivable you have hardware problems. Another possibility is some bad indexing fouling the chains (gpc doesn't have range checking :-[) where the nmalloc stuff might come in handy.
You could build a list validator, which only walks the list from root to final NIL. To use it add a field set to a constant value, and the validator checks that value is there for each entry. Should be quite fast, and you can then call it here and there in the program. This would give an early indication of fouled storage.
CONST VALIDMARK = 1234;
TYPE rec = RECORD validation : integer; .... END;
PROCEDURE linsert(....) .... new(temp); temp.validation := VALIDMARK; ...
PROCEDURE listvalidate(calledfrom : integer);
LABEL 10;
VAR temp : prec;
BEGIN (* listvalidate *) temp := root; WHILE temp <> NIL DO IF temp.validation <> VALIDMARK THEN BEGIN takesomeaction(calledfrom); goto 10; END; 10: END; (* listvalidate *)
(you may notice I don't believe in using things outside of ISO7185 without very good reasons, none of which I see here <g>)
If all else fails see line 2 of the sig.