Mirsad Todorovac wrote:
Additionally, I am interested in whether there is a plan to introduce NX bit support (or emulation), and about StackGuard or similar "canary" protection.
I suppose this means non-executable, for stack memory?
As you correctly concluded.
Does NX bit use depend on compiler support at all, or is it completely OS-supported/dependent? I reckon there ought to be something in the RTS mmap()-ing the stack area with PROT_EXEC disabled, or am I wrong?
The stack is allocated (initially, and when it grows) by the OS automatically, so it's an OS issue. However, GPC actually is affected by it (more precisely, those GCC backends that implement trampolines on the stack, with those frontends (languages) that have local routines and pointers/references to routines -- which includes, of course, Pascal). There's actually code in several backends to re-enable PROT_EXEC for stack areas where needed, as a work-around.
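For illustration, here's the kind of Pascal code that triggers this -- a hypothetical minimal example where a local routine that accesses its enclosing frame is passed as a procedural parameter, which the GCC backend may implement with a trampoline built on the stack:

program TrampolineDemo (Output);

procedure CallIt (procedure P);
begin
  P
end;

procedure Outer;
var
  X: Integer;

  procedure Local;
  begin
    { up-level access to X: calling Local from elsewhere needs the
      static chain, which the backend may provide via a small piece
      of code (a trampoline) built on the stack -- and that stack
      area must then be executable }
    WriteLn ('X = ', X)
  end;

begin
  X := 42;
  CallIt (Local)
end;

begin
  Outer
end.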
How do they implement disabling PROT_EXEC without a hardware NX bit, I wonder? I've heard of emulations, but I never understood how they work. Probably I should do my homework. ;-)
Why without? AFAIK, PROT_EXEC is (roughly speaking) the software side of hardware NX.
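For illustration: on POSIX systems, page protections are set with mmap()/mprotect(), and on hardware with an NX bit (or a usable equivalent), omitting PROT_EXEC makes the page non-executable. A sketch of calling mprotect() from GPC (the declaration and the constants, with their usual Linux values, are assumptions taken from <sys/mman.h>; a real program would also have to page-align the address):

program ProtDemo (Output);

const
  PROT_READ  = 1;  { usual values from <sys/mman.h> on Linux }
  PROT_WRITE = 2;
  { PROT_EXEC = 4 -- deliberately left out below }

function MProtect (Addr: Pointer; Len: SizeType; Prot: CInteger): CInteger;
  external name 'mprotect';

var
  P: Pointer;

begin
  GetMem (P, 4096);
  { assume P happens to be page-aligned here, for brevity }
  if MProtect (P, 4096, PROT_READ or PROT_WRITE) <> 0 then
    WriteLn ('mprotect failed')
end.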
BTW, according to http://en.wikipedia.org/wiki/NX_Bit:
: Although this sort of mechanism has been around for years in various
: other processor architectures such as Sun's SPARC, Alpha, IBM's
: PowerPC, and even Intel's IA-64 architecture (also known as Itanium
: or Merced processor), the term is actually a name created by AMD for
: use by its AMD64 line of processors, such as the Athlon 64 and
: Opteron. It seems to have now become a common term used to
: generically describe similar technologies in other processors.
: Intel's x86 processors included this feature since 80286 processor,
: but that memory model is treated as obsolete by modern processors
: and operating systems. De facto it could not be used by modern
: programs, and AMD re-implemented the feature for the Flat memory
: model used now.
So if you thought this was a brand-new hardware feature, it really isn't (apart from the name). Just Intel had screwed up for too long ...
I've seen an article that explains a heap overrun exploit in detail, and how it was made impossible by heap randomization in Windows XP SP2. But this is slightly off-topic, unless we decide to introduce a better heap allocator option for the GPC RTS.
Randomization might help to spoil particular attacks, or make them less likely to succeed, but cannot provide perfect protection. BTW, this should also be possible with a plug-in MM replacement.
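For illustration, a minimal sketch (all names made up; PtrCard is assumed to be GPC's pointer-sized cardinal type) of one thing such a plug-in MM could do -- prepend a random pad to each block and stash the real base just before the returned address:

program RandomPadDemo;

type
  PPointer = ^Pointer;

procedure RGetMem (var P: Pointer; Size: SizeType);
var
  Pad: PtrCard;
  Raw: Pointer;
begin
  Pad := 16 * PtrCard (Random (16) + 1);  { random 16 .. 256 byte pad }
  GetMem (Raw, Size + Pad);
  P := Pointer (PtrCard (Raw) + Pad);
  PPointer (PtrCard (P) - SizeOf (Pointer))^ := Raw  { remember real base }
end;

procedure RFreeMem (P: Pointer; Size: SizeType);
var
  Raw: Pointer;
begin
  Raw := PPointer (PtrCard (P) - SizeOf (Pointer))^;
  FreeMem (Raw, Size + (PtrCard (P) - PtrCard (Raw)))
end;

var
  P: Pointer;

begin
  Randomize;
  RGetMem (P, 100);
  RFreeMem (P, 100)
end.

This only perturbs the relative heap layout, of course -- as said above, it spoils particular attacks rather than providing real protection.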
Do you know of an example of memory-mapping holes being deallocated? In fact, statistically, most of the holes will be smaller than a page, won't they?
I don't really have statistics. I suppose when deallocating a list, some pages would be freed completely and be available either for returning to the OS, or at least for reuse within the same process (possibly with other chunk sizes, as needed). I suppose typical MMs do at least the latter.
This requires indirect pointers. Then all pointers to a certain memory area could share a base pointer, and IMHO it is neither a big complication nor a big speed penalty, compared to the enhancements we get!
It is! Any solution that changes the ABI or something like that, and thus at least requires recompilation of all libraries, is a big deal in practical terms (and IMHO should not even be attempted in a single language on its own, unless that language operates in an isolated world, sandbox or whatever).
For example, all the base pointers could fit on a few memory pages, which would easily fit in the processor caches -- and they would stay there, since they are used often. So this form of indirection does not appear to be costly in terms of raw performance.
I'm skeptical. First, the cache used for them is not available for other purposes. Second, it takes additional instructions, which take time to execute and consume cache for the code. And since the change is pervasive, you don't even have the option of disabling it in tight inner loops; you can only hope the optimizer is smart enough to move as many indirections as possible out of the loop.
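To make the scheme under discussion concrete, here is a hypothetical sketch of such double indirection (similar in spirit to the "handles" of classic Mac OS); the extra hop on every access is exactly the cost in question:

program HandleDemo (Output);

type
  Handle = ^Pointer;  { points into the base-pointer table }
  IntPtr = ^Integer;

var
  { the shared base-pointer table that is supposed to stay in cache }
  BaseTable: array [1 .. 1024] of Pointer;
  NextFree: Integer;

function HAlloc (Size: SizeType): Handle;
begin
  GetMem (BaseTable[NextFree], Size);
  HAlloc := @BaseTable[NextFree];
  Inc (NextFree)
end;

var
  H: Handle;

begin
  NextFree := 1;
  H := HAlloc (SizeOf (Integer));
  { double dereference on every access; a compacting MM could move
    the block and update only the table entry }
  IntPtr (H^)^ := 42;
  WriteLn (IntPtr (H^)^)
end.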
(Frankly, range checking is costly also, isn't it?)
Yes, but you can avoid it, e.g. by typing your counters and index variables with appropriate subranges, thus moving the necessary checks outside of critical inner loops (without needing any compiler options to turn them off temporarily, and not relying on the optimizer, but guaranteed by standard language features). Chuck Falconer has written about this more than once on this list.
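A small example of the technique: with the counter declared with the array's own subrange type, its value is guaranteed to be a valid index, so no per-access check is needed inside the loop:

program SubrangeDemo;

const
  N = 1000;

type
  Index = 1 .. N;

var
  A: array [Index] of Integer;
  I: Index;  { subrange-typed counter: always a valid index for A }

begin
  for I := 1 to N do
    A[I] := I  { no range check needed on this access }
end.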
But the idea of memory compaction and de-fragmentation still seems very good to me, even though it looks like science fiction now. Probably the best first step would be to try to implement it as a library, instead of a change to the compiler or RTS, right?
Yes, I think so.
I understand the power-of-two heaps idea, but it still seems to me that there is plenty of room left for defragmentation.
Power of two is probably not the final word on this issue. But you might want to run some actual tests with real-world programs and current memory managers (say, that of glibc) to get significant numbers.
Some of these issues have been discussed here ( http://www.javaworld.com/javaworld/jw-08-1996/jw-08-gc-p3.html ):
[...]
(Sorry if this is too long of a paste)
GC is really a science of its own (literally), and IMHO it's a bit off-topic here.
What I had in mind from the beginning is essentially the third strategy: "registered pointers". The new type of pointers would have to be registered, and the heap manager would have to update all registered pointers to reflect a copy-and-move of a heap area. As this would be done on the fly, and all pointers would point to the same byte of data after the move, the defragmentation would be transparent to any program.
But it's also pervasive, i.e. it affects all libraries (since they may copy pointers). And it would have to take care of pointers on the stack, in registers, etc.
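For concreteness, a minimal sketch of the registration and patching (made-up names; PtrCard assumed to be GPC's pointer-sized cardinal type) -- and it makes the pervasiveness problem visible: only pointers that went through RegisterPtr ever get patched:

program RegPtrDemo;

type
  PtrLoc = ^Pointer;  { the *address* of a registered pointer variable }

var
  RegLocs: array [1 .. 256] of PtrLoc;
  RegCount: Integer;

procedure RegisterPtr (Loc: PtrLoc);
begin
  Inc (RegCount);
  RegLocs[RegCount] := Loc
end;

{ Called by the hypothetical compacting heap manager after moving a
  block of Size bytes from OldBase to NewBase: patch every registered
  pointer that pointed into the old block. }
procedure PatchAfterMove (OldBase, NewBase: Pointer; Size: SizeType);
var
  I: Integer;
  Old, P: PtrCard;
begin
  Old := PtrCard (OldBase);
  for I := 1 to RegCount do
  begin
    P := PtrCard (RegLocs[I]^);
    if (P >= Old) and (P < Old + Size) then
      RegLocs[I]^ := Pointer (PtrCard (NewBase) + (P - Old))
  end
end;

begin
  RegCount := 0
end.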
OK, thanks for pointing that out. I am trying to study the Boehm papers. Perhaps everything I proposed has already been done :-(
It's surely been discussed at length, and there are experts in this area (which neither of us is), and this list is not really the place to discuss it ...
Better heap use would allow introducing a "heap canary" technique that would eliminate another source of exploits.
According to http://en.wikipedia.org/wiki/Stack_frame, canaries are there to detect overflows in the first place.
It also describes a way to prevent such attacks on the return address on the stack, by detecting them before the return instruction (which may not be 100% foolproof, with longjmp etc., but is rather reliable). But I don't see how this directly translates to the heap. Of course, you could check a canary before *every* access to some heap-allocated variable, but that would be quite some overhead (perhaps only useful for debugging), and probably still not easy to implement -- if you have, say, a var parameter or a pointer parameter, how does the routine know whether it has a canary (allocated with one) or not (allocated from foreign code, or a global or local variable not on the heap)?
IMHO, the canary ought to be checked on free(). This would catch a number of errors, since the most common error alongside buffer overruns is probably an off-by-one error in a loop.
This would serve as a debugging aid (similar to efence), not as attack prevention, as an attack can occur before free().
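As such a debugging aid, the check-on-free could look roughly like this (a sketch with made-up names, wrapping GPC's GetMem/FreeMem: four canary bytes are appended to each block and verified when it is freed):

program CanaryDemo (Output);

type
  Bytes = array [0 .. MaxInt div 2] of Byte;
  BytesPtr = ^Bytes;

procedure CGetMem (var P: Pointer; Size: Cardinal);
begin
  GetMem (P, Size + 4);
  { plant the canary just past the user area }
  BytesPtr (P)^[Size]     := $DE;
  BytesPtr (P)^[Size + 1] := $AD;
  BytesPtr (P)^[Size + 2] := $BE;
  BytesPtr (P)^[Size + 3] := $EF
end;

procedure CFreeMem (P: Pointer; Size: Cardinal);
begin
  if (BytesPtr (P)^[Size] <> $DE) or (BytesPtr (P)^[Size + 1] <> $AD)
     or (BytesPtr (P)^[Size + 2] <> $BE) or (BytesPtr (P)^[Size + 3] <> $EF)
  then
    WriteLn ('heap canary smashed: buffer overrun detected at free time');
  FreeMem (P, Size + 4)
end;

var
  P: Pointer;

begin
  CGetMem (P, 100);
  { an off-by-one overrun of the 100-byte block would clobber the
    canary and be reported by the following call }
  CFreeMem (P, 100)
end.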
Obviously, the source code. I haven't been very clear about the obvious fact that changing pointer semantics and/or size into indirect pointers or pointers with upper and lower bounds would inevitably imply at least recompiling the libraries, assuming the indirection were implemented transparently to existing source. And bearing in mind that GPC uses libc extensively, this may not be feasible in terms of the necessary wrappers.
Actually, it means that the only possibly successful way is probably doing it all in the backend, so it would work the same for all languages. Perhaps you can get such an option into the backend. Then you basically only have to compile two versions of your whole system ... ;-)
IMHO it is rather a separate project. GPC is not really short of features (existing and wish-list ones), and due to the considerations above, it seems easily separable (unless you really want to add compiler-supported checking, which you don't seem to want, according to the previous paragraph).
The problem is: why are all those fancy heap-protection mechanisms such as libefence not used more widely? Simply because they are not handy, and few people know of them. Having such protection seamlessly distributed with the compiler, or available as a language option, might make people consider using it.
IMHO it would be considerably less work to create and distribute an integrated environment containing the existing tools and making them easily available than writing something entirely new just for this reason (i.e., unless it has other advantages).
The basic inspiration came from my study of a month-and-a-half-long virus invasion, and of the recent network and system intrusions I faced helplessly during the last six months or so as a system administrator.
If the buffer overrun protection isn't elegant, seamless and nearly mandatory, programmers might not use it, I'm afraid.
Can we do something in this direction?
I'm afraid we probably cannot solve The Computer Security Problem within the next two weeks. ;-)
And, BTW, this is also an area of its own, with its own experts. This does not mean we should not care here, but one should really first study existing work and the state of the art. If the implementation of some technique requires compiler support, we can discuss it here, but this is not really the place to design new techniques (which most likely will have been discussed by the experts already).
PS. I hope this doesn't get in HTML on the list, since this is the first time I use Thunderbird.
No, it didn't.
Frank