On Sat, 28 Jan 2006, Frank Heckenbach wrote:
How do they implement PROT_EXEC disable without a hardware NX bit, I wonder? I've heard of emulations, but I never understood how they work. Probably I should do my homework ;-)
Why without? AFAIK, PROT_EXEC is (roughly speaking) the software side of hardware NX.
Could be on some platforms. AFAIR, there was a platform (Linux? it's so dim in my memory) which did not implement PROT_EXEC protection. AFAIK, Linux kernels generally did not until the 2.6 versions. But you must have known this, and this is a backend issue, I suppose.
BTW, according to http://en.wikipedia.org/wiki/NX_Bit:
: Although this sort of mechanism has been around for years in various
: other processor architectures such as Sun's SPARC, Alpha, IBM's
: PowerPC, and even Intel's IA-64 architecture (also known as Itanium
: or Merced processor), the term is actually a name created by AMD for
: use by its AMD64 line of processors, such as the Athlon 64 and
: Opteron. It seems to have now become a common term used to
: generically describe similar technologies in other processors.
: Intel's x86 processors included this feature since 80286 processor,
: but that memory model is treated as obsolete by modern processors
: and operating systems. De facto it could not be used by modern
: programs, and AMD re-implemented the feature for the Flat memory
: model used now.
So if you thought this was a brand-new hardware feature, it really isn't (apart from the name). Just Intel had screwed up for too long ...
Agreed. Some of the things Intel gets away with are beyond my comprehension. Thank Gawd for AMD :-) who brought NX bit back ...
I've seen an article that explains a heap overrun exploit in detail, and how it was made impossible by heap randomization in Windows XP SP2. But it is slightly off-topic, unless we decide to introduce a better heap allocator option for the GPC RTS.
Randomization might help to spoil particular attacks, or make them less likely to succeed, but cannot provide perfect protection. BTW, this should also be possible with a plug-in MM replacement.
I trust your experience and expertise on that.
Do you know of an example of memory mapping holes being deallocated? In fact, statistically most of the holes will be less than a page in size, won't they?
I don't really have statistics. I suppose when deallocating a list, some pages would be freed completely and be available either for returning to the OS, or at least for reuse within the same process (possibly with other chunk sizes, as needed). I suppose typical MMs do at least the latter.
Agreed. OTOH, if we run, for example, a database, insertions/deletions to/from the heap may be seemingly random. I trust a lecture I heard from my college professor, and I think he had some references with extensive simulations at least. IMHO a general-purpose allocator should not rely on "burst" allocations/deallocations, but should assume a more stochastic pattern of memory manager use.
I suppose GPC uses the default libc's malloc/calloc/free, and AFAIK the libc malloc team is also thinking of improving the default malloc in libc, as this would give an immediate improvement even to already-linked programs if ABI compatibility is preserved.
This requires indirect pointers. Then all pointers into a certain memory area could share a base pointer, and IMHO it is neither a big complication nor a big speed penalty compared to the enhancements we get!
It is! Any solution that changes the ABI or something like that, and thus at least requires recompilation of all libraries, is a big deal in practical terms (and IMHO should not even be attempted in a single language on its own, unless that language operates in an isolated world, sandbox or whatever).
OK, I see the point. I will have to do some serious study on this. It won't hurt to learn more about GPC internals and how it uses libc.
For example, all base pointers could fit on a few memory pages that would easily fit in the processor caches, and they would stay cached since they are used often. So, this form of indirection does not appear to be costly in terms of raw performance.
I'm skeptical. First, the cache used is not available for other purposes. Second, it takes additional instructions which take time to execute and consume cache for the code. And since the change is pervasive, you don't even have the option to disable it in tight inner loops; you can only hope the optimizer is smart enough to move as many indirections as possible out of the loop.
True.
(Frankly, range checking is costly also, isn't it?)
Yes, but you can avoid it, e.g. by typing your counters and index variables with appropriate subranges, thus moving the necessary checks outside of critical inner loops (without needing any compiler options to turn them off temporarily, and not relying on the optimizer, but guaranteed by standard language features). Chuck Falconer has written about this more than once on this list.
I see the difference.
But the idea of memory compaction and defragmentation still seems very good to me, even though it looks like SF now. Probably the best approach would be to try to implement it as a library, instead of a change to the compiler or RTS, as a first step, right?
Yes, I think so.
So we agree on something.
I understand the power-of-two heaps idea, but it still seems to me that there is a vast space for defragmentation.
Power of two is probably not the final word on this issue. But you might want to run some actual tests with real-world programs and current memory managers (say, that of glibc) to get significant numbers.
I am putting it on my TO-DO list. It is a very interesting issue in general, for all operating systems that use paged virtual memory. (OTOH, coming back to releasing unused holes: unused pages will probably be swapped out and never reloaded, since they are not used; the problem is stochastic allocation/deallocation of relatively small fragments of memory. For example, if the average size of records is between 512 and 1023 bytes, this will waste a lot of space on a 2^10 heap, and after allocations/deallocations reach an asymptotically stable state there will be a roughly normal distribution of used and unused areas on each physical memory page. This means that the allocated physical pages of the heap may eventually double the program's memory needs. I may look for literature; right now I am speaking from memory of those lectures about Unix processes.)
Some of these issues have been discussed here ( http://www.javaworld.com/javaworld/jw-08-1996/jw-08-gc-p3.html ):
[...]
(Sorry if this is too long of a paste)
GC is really a science of its own (literally), and IMHO it's a bit off-topic here.
OK. I will try to stay on focus.
What I had in mind in the beginning is essentially the third strategy: "registered pointers". The new type of pointers would have to be registered, and the heap manager would have to update all registered pointers to reflect the heap area copy-and-move. As it would be done on the fly, and all pointers would point to the same byte of data after the move is made, the defragmentation would be transparent to any program.
But it's also pervasive, i.e. it affects all libraries (since they may copy pointers). And it would have to take care of pointers on the stack, in registers, etc.
I see. I realize adding security measures drastically impacts performance (such as making all pointers "volatile" variables which cannot be kept in registers), but having an important system brought to its knees by an undetected buffer overrun in an application will hurt me more, both as a system administrator and as a software developer, than a 20% decrease in program speed. IMHO.
OK, thanks for pointing that out. I am trying to study the Boehm papers. Perhaps everything I proposed has already been done :-(
It's surely been discussed at length, and there are experts in this area, which neither of us is, and this list is not really the place to discuss it ...
I have understood your argument. I am trying to stick to those issues that pertain to greater security of Pascal programs, as it can be enforced by the language.
IMHO, the canary ought to be checked on free(). This would catch a number of errors, since the most common error, alongside buffer overruns, is probably an off-by-one error in a loop.
This would serve as a debugging aid (similar to efence), not as attack prevention, as an attack can occur before free().
True. However, several attack scenarios rely on smashing the allocator's list pointers and overwriting an arbitrary location in memory. A canary could prevent that, if checked prior to evaluating the pointers that follow it. :-)
Then again, if GPC depends on libc malloc internally, I guess it is a libc issue, not a GPC issue.
Obviously, the source code. I haven't been very clear about the obvious fact that changing pointer semantics and/or size into an indirect pointer, or a pointer with upper and lower boundaries, would inevitably imply at least recompilation of libraries, assuming that the indirection would be implemented transparently to existing source. And having in mind that GPC uses libc extensively, this may not be feasible in terms of the necessary wrappers.
Actually it means that the only possibly successful way is probably doing it all in the backend, so it would work the same for all languages. Perhaps you can get such an option into the backend. Then you basically only have to compile two versions of your whole system ... ;-)
Rightly said: StackGuard does exactly that! I could try to apply the StackGuard patch once GPC has accepted the 4.* backends.
IMHO, I see it rather as a separate project. GPC is not really short of features (existing and wishlist), and due to the considerations above, it seems easily separable (unless you really want to add compiler-supported checking which you don't seem to want, according to the previous paragraph).
The problem is: why are all those fancy heap protection mechanisms like libefence not used more widely? Simply because they are not handy, and few people know of them. Having such a mechanism seamlessly distributed with the compiler, or as a language option, might make people consider using it.
IMHO it would be considerably less work to create and distribute an integrated environment containing the existing tools and making them easily available than writing something entirely new just for this reason (i.e., unless it has other advantages).
It could be much less work, agreed. I did not insist on writing everything from scratch myself ;-) no matter how much I enjoy mucking with internals. I realize, however, the problem of code maturity, which my new code would inevitably lack.
The basic inspiration came from my study of a month-and-a-half-long virus invasion and the recent network and system intrusions I faced helplessly in the last six months or so as a system administrator.
If the buffer overrun protection isn't elegant, seamless and nearly mandatory, programmers might not use it I'm afraid.
Can we do something in this direction?
I'm afraid we probably cannot solve The Computer Security Problem within the next two weeks. ;-)
;-)
And, BTW, this is also an area of its own, with its experts. This does not mean we should not care here, but one should really first study existing work and state of the art. If implementation of some techniques require compiler support, we can discuss them here, but this is not really the place to design new techniques (which most likely will have been discussed by the experts already).
I guess you want to tell me I need to do more homework before raising similar issues, so I will try to do so next time. However, it is hard to become an expert in a month, so I was relying on your experience ;-) May I be forgiven.
Thank you for your time, and I will now try to do some research.
Mirsad