Is this a joke?
If not, I suggest you find out what "compiler" means! <G>
Joe.
-----Original Message----- From: a2782@dis.ulpgc.es [SMTP:a2782@dis.ulpgc.es] Sent: Wednesday, December 11, 2002 10:31 AM To: gpc@gnu.de Subject: Translation Scheme
Hi to all!
Has anyone a Translation Scheme from Pascal to Intel 80386 code? Or does anyone know how to get it?
Thanks in advance.
Mmm?
No, it's not a joke! I hope you can tell me what "compiler" means. I used to think that a compiler generates object code (or intermediate code) from source code in several phases... but it seems that this definition is wrong!
Anyway, I need to convert Pascal source into i386 code. A Translation Scheme for this would be useful to me.
Thanks in advance!
Hi!
On Wed, Dec 11, 2002 at 09:00:46AM +0000, a2782@dis.ulpgc.es wrote:
I used to think that a compiler generates object code (or intermediate code) from source code in several phases... but it seems that this definition is wrong! Anyway, I need to convert Pascal source into i386 code. A Translation Scheme for this would be useful to me.
I do not know what "Translation Scheme" means, but if you have some Pascal code like:
program Hello;
begin WriteLn ('Hello, World!') end.
in a file called "hello.pas", you just have to run a compiler, such as the GNU Pascal Compiler, with:
gpc hello.pas -o hello
to produce an executable.
Afterwards, just run `hello' in your shell.
If you want something else, please be more precise.
Eike
Ok, ok. It's true, maybe I have to be more precise.
The problem may be in the words 'Translation Scheme' :). In fact, I translated it from the Spanish expression 'esquema de traducción'.
I am asking for the correspondence between Pascal structures and i386 code. For example, when the parser reads a 'procedure' declaration, which i386 instructions are generated (movl, enter, push...)? When the parser reads an assignment (a:=b), what code is generated, and so on...
I know that GPC does something different, because it's 'only' a front-end (and uses Trees and RTL). But I wonder if someone on this mailing list has this scheme.
Thanks!
On 11 Dec 2002 at 10:50, a2782@dis.ulpgc.es wrote:
I am asking for the correspondence between Pascal structures and i386 code. For example, when the parser reads a 'procedure' declaration, which i386 instructions are generated (movl, enter, push...)? When the parser reads an assignment (a:=b), what code is generated, and so on...
I know that GPC does something different, because it's 'only' a front-end (and uses Trees and RTL). But I wonder if someone on this mailing list has this scheme.
GPC can generate assembler output (IIRC, in AT&T syntax). Run the command: gpc -S foo.pas
You will get an assembler file called foo.s, which contains all the "movl", "push" or whatever, that you could care to see.
Best regards, The Chief --------- Prof. Abimbola Olowofoyeku (The African Chief) Web: http://www.bigfoot.com/~african_chief/
On Wed, 11 Dec 2002 a2782@dis.ulpgc.es wrote:
I know that GPC does something different, because it's 'only' a front-end (and uses Trees and RTL). But I wonder if someone on this mailing list has this scheme.
GPC is not a stand-alone front-end. It is like the front 2 wheels of a 4 wheel car. You need all 4 wheels to go anywhere.
There is a little bit of confusion because when you combine the GPC front-end with the back-end the result is still called GPC.
Russ
I know. That is why I am not going to look through the sources of GPC...
Eike Lange wrote:
I do not know what "Translation Scheme" means, but if you have some Pascal code like:
program Hello;
begin WriteLn ('Hello, World!') end.
in a file called "hello.pas", you just have to run a compiler, such as the GNU Pascal Compiler, with:
gpc hello.pas -o hello
to produce an executable.
Maybe you want an assembly listing? Then type
gpc -S hello.pas
and you will obtain hello.s.
But you may be surprised: it is in AT&T assembly syntax, not any variant of Intel syntax. I do not know if there is any way to convert it (or to output Intel syntax directly).
Maurice
Well, that's an option. I could write some simple programs in Pascal and see what assembly code is generated.
But I think there should be a formal document for this step. I mean, when somebody implements a compiler, he/she must follow several steps: lexing, parsing, symbol table generation and object code generation. Well, I am looking for a scheme with the correspondences between the structures being parsed and the code generated.
E.g.:
program: program id other body; --> initialize memory
body: begin instructions end ';' --> generate instructions 'enter', 'leave'
...
I'm not interested in sources of GPC, because GPC is designed as a front-end, and I search for the back-end...
Thanks for the patience!
It is still not clear what you really want. I will assume that you want to understand how a compiler works (maybe write a simple one). Gcc is an optimizing compiler working in many passes, and there is really NO simple scheme: the generated code depends on the source in a highly "non-additive" way. In particular, in the last stages the generated code is rearranged to allow more instructions to execute in parallel, and some sequences of instructions are replaced by better ones. As far as I know you can disable most of the optimizations (using the -O0 flag), but some optimizations are still performed. In fact, turning optimizations on or off changes which procedures are used in some stages to generate code, so the translation scheme really depends on the exact switches you give to gcc.
In theory one should precisely describe the expected effect of the various transformations and only then begin to code the compiler. In practice tiny little details matter most, and once you have spelled out exactly every little detail, you realize that what you really have is a computer program. It is wasteful to code the same computation twice (and there is little hope that two different programs will perform the same computation anyway), so the only formal description of what gcc is doing is the source code.
If you want an informal overview of how gcc works, besides the gcc documentation you may look at: http://cobolforgcc.sourceforge.net/cobol_14.html where you can find probably the best (however incomplete) description of the interface between the front end and the back end. If you want to know the exact rules used to produce i386 instructions from gcc-internal data you may look at gcc/config/i386/i386.md (but that file is cryptic; I would not dare to modify it).
I think one general remark is in order: the compiler performs the translation in many phases. Even if each phase is very simple (not the case with gcc), the final effect may appear complex -- in other words, if you try to describe the process as a single step then the description becomes very complex.
If you want a very simplified description of the whole process, here it goes:
first stage -- build data structures in the compiler: collect type, variable and procedure declarations; store procedure bodies as trees representing sequences of instructions, loops, conditionals, procedure calls and assignments. The main program is treated as the body of a fictional procedure.
The first stage is almost independent of the target.
second stage -- allocate variables: in Pascal you basically compute how much space the variables will take.
In the second stage you have to know how big the basic types are, and possibly the alignment rules -- on i386 you may use no alignment, but you get better performance (and compatibility with other compilers) if you allocate 2-byte variables at even addresses and 4-byte variables at addresses divisible by 4 (on the newest processors you should also align 8-byte variables).
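The allocation step described above can be sketched in a few lines of Python. This is only an illustration, not what gcc does: the names `align_up` and `allocate` are made up, and it assumes every variable is a scalar aligned to its own size (records and arrays would need their own rules).

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def allocate(variables):
    """Assign an offset to each variable, in declaration order.

    variables is a list of (name, size_in_bytes) pairs. Assumption:
    each scalar is aligned to its own size (2, 4 or 8 bytes).
    Returns (offsets_by_name, total_size).
    """
    offsets = {}
    offset = 0
    for name, size in variables:
        offset = align_up(offset, size)  # skip padding bytes if needed
        offsets[name] = offset
        offset += size
    return offsets, offset


# Hypothetical declarations: a 1-byte Char, a 4-byte Integer,
# a 2-byte integer and an 8-byte Real.
layout, total = allocate([("c", 1), ("i", 4), ("s", 2), ("d", 8)])
print(layout, total)
```

Declaring the variables in a different order changes the amount of padding, which is one reason the layout cannot be read off the Pascal source alone.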
Now we can generate code. The bulk of the code is expressions: they are trees with simple (binary or unary) operators or function calls in the nodes. We may assume that simple operators work on integers and that each operation corresponds to a single machine instruction -- other operators are replaced by function calls. One big task when translating expressions is register allocation. In a simple scheme you allocate a temporary variable for each tree node and just fetch the values from memory before any operation and store the result after the operation. So
    x := x + y;
becomes
    movl x,%eax
    movl y,%ebx
    addl %ebx,%eax
    movl %eax,x
if x and y are global variables. For local variables you use the offset inside the stack frame, like:
    movl -8(%ebp),%eax
For boolean expressions like
    x := y > z;
the code looks like:
    movl y,%eax
    movl z,%ebx
    cmpl %ebx,%eax
    jg l1
    movl $0,%eax
    jmp l2
l1: movl $1,%eax
l2: movl %eax,x
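As a rough illustration of the "one temporary per tree node" scheme, here is a small Python sketch that prints AT&T-syntax code for an integer assignment. Everything in it is hypothetical: the function name `gen_assign`, the tuple encoding of expression trees, and the temporaries `t1`, `t2`, ..., which are assumed to be global memory cells declared elsewhere. Only `+`, `-` and `*` on 32-bit global variables are handled.

```python
import itertools

# Two-operand AT&T instructions computing %eax := %eax op %ebx.
OPS = {"+": "addl", "-": "subl", "*": "imull"}


def gen_assign(target, expr):
    """Return assembly lines for `target := expr`.

    expr is a variable name (str), an int constant, or a tuple
    (op, left, right). Hypothetical encoding, for illustration only.
    """
    out = []
    temps = itertools.count(1)

    def operand(node):
        # Constants need the `$` prefix; variables are used by name.
        return f"${node}" if isinstance(node, int) else node

    def gen(node, dest=None):
        """Emit code for node; return the place that holds its value."""
        if not isinstance(node, tuple):
            return operand(node)
        op, left, right = node
        lhs = gen(left)
        rhs = gen(right)
        if dest is None:
            dest = f"t{next(temps)}"  # one memory temporary per node
        out.append(f"movl {lhs},%eax")
        out.append(f"movl {rhs},%ebx")
        out.append(f"{OPS[op]} %ebx,%eax")
        out.append(f"movl %eax,{dest}")
        return dest

    if isinstance(expr, tuple):
        gen(expr, dest=target)  # the root node stores into the target
    else:
        out.append(f"movl {operand(expr)},%eax")
        out.append(f"movl %eax,{target}")
    return out


print("\n".join(gen_assign("x", ("+", ("*", "a", "b"), "c"))))
```

For `x := x + y` this produces the same four instructions as the hand-written fragment above; for nested expressions every inner node goes through memory, which is exactly why such code is slow.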
For a conditional instruction
    if cond then I1 else I2;
the code looks like:
    Computation of cond
    movl cond,%eax    # cond is a temporary storing the value of the condition
    cmpl $0,%eax
    jz l1
    Translation of I1
    jmp l2
l1: Translation of I2
l2:
Capital letters above mean that you need to expand the corresponding fragment.
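The conditional template can be written as a function that glues already-translated fragments together. This is a sketch under assumptions: `gen_if` and its parameters are invented names, the fragments are plain lists of assembly lines, and fresh labels come from a shared counter so that nested conditionals do not clash.

```python
import itertools


def gen_if(cond_code, cond_temp, then_code, else_code, labels):
    """Assemble code for `if cond then I1 else I2`.

    cond_code computes the condition into the memory temporary
    cond_temp; then_code and else_code are the translations of I1
    and I2. labels is a shared counter used to make fresh labels.
    """
    l_else = f"l{next(labels)}"
    l_end = f"l{next(labels)}"
    return (cond_code
            + [f"movl {cond_temp},%eax",
               "cmpl $0,%eax",
               f"jz {l_else}"]       # condition false: skip I1
            + then_code
            + [f"jmp {l_end}",       # do not fall through into I2
               f"{l_else}:"]
            + else_code
            + [f"{l_end}:"])


labels = itertools.count(1)
code = gen_if(["# Computation of cond"], "cond",
              ["# Translation of I1"], ["# Translation of I2"], labels)
print("\n".join(code))
```

Because the label counter is passed in rather than created inside the function, the translation of I1 or I2 may itself call `gen_if` with the same counter.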
Procedure call
    foo(expr1, expr2, expr3)
with all parameters passed by value:
    Compute expr3
    push expr3
    Compute expr2
    push expr2
    Compute expr1
    push expr1
    call foo
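A sketch of the call sequence in Python follows. The names `gen_call` and `gen_arg` are made up. Two things here are assumptions rather than part of the description above: every argument is a 4-byte value that `gen_arg` leaves in %eax, and the caller removes the arguments from the stack after the call (as in the usual C convention on i386), which the description does not mention.

```python
def gen_call(proc, args, gen_arg, caller_cleans=True):
    """Return assembly lines for `proc(args...)`, all by value.

    gen_arg(arg) must return code that leaves the argument's value
    in %eax (hypothetical callback). Arguments are pushed right to
    left, so the first argument ends up on top of the stack.
    """
    out = []
    for arg in reversed(args):
        out.extend(gen_arg(arg))
        out.append("pushl %eax")
    out.append(f"call {proc}")
    if caller_cleans and args:
        # Assumption: caller pops the arguments, 4 bytes each.
        out.append(f"addl ${4 * len(args)},%esp")
    return out


def load_var(name):
    """Simplest possible argument: a global variable."""
    return [f"movl {name},%eax"]


print("\n".join(gen_call("foo", ["a", "b", "c"], load_var)))
```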
Procedure body:
    Prolog
    Expand body instructions
    Epilog
Prolog and Epilog depend on the exact calling convention. With the scheme above only ebp and esp need saving, so simple enter and leave are enough, but gcc wants to have more registers preserved, so one has to push them on the stack in the Prolog and pop them in the Epilog.
Pointer dereference:
    x := y^;
becomes
    movl y,%eax
    movl (%eax),%eax
    movl %eax,x
Arrays and other complex data are effectively represented by pointers (addresses) -- given the address of the whole object, the compiler computes the address of a component. You may reduce the whole of Pascal to simple fragments like the ones above (formally you need more such fragments, but they are similar) plus address computations.
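The address computations mentioned here are plain arithmetic. A minimal sketch, assuming a one-dimensional array with a constant lower bound and elements stored contiguously, and a record whose field offsets were fixed in the allocation stage (the function names are made up):

```python
def element_address(base, index, low, elem_size):
    """Address of a[index] for `array [low..high] of T` stored at base."""
    return base + (index - low) * elem_size


def field_address(base, field_offset):
    """Address of a record field, given the offset chosen at allocation."""
    return base + field_offset


# Hypothetical example: `a: array [1..10] of Integer` at address 1000,
# with 4-byte integers. a[3] is two elements past the start.
print(element_address(1000, 3, 1, 4))
```

A generated program performs the same arithmetic with instructions such as subl, imull and addl, and then uses the result like the pointer in the dereference fragment above.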
If you are trying to understand compilers, you should probably look at something simpler than the whole of Pascal -- you can find examples of compilers for subsets of Pascal on the net. The scheme I presented above can handle full Pascal, but given in full detail it would be long, boring and (I believe) not easier to understand because of the many details -- and the resulting compiler would generate really lousy code (IMHO gcc gives you 10-20 times faster code).
a2782@dis.ulpgc.es wrote:
But I think there should be a formal document for this step. I mean, when somebody implements a compiler, he/she must follow several steps: lexing, parsing, symbol table generation and object code generation.
Indeed. GPC does the first three and the backend (which is part of the `gpc' executable, but not of the sources we commonly refer to as GPC) does the last.
Well, I am looking for a scheme with the correspondences between the structures being parsed and the code generated.
E.g.:
program: program id other body; --> initialize memory
body: begin instructions end ';' --> generate instructions 'enter', 'leave'
...
I'm not interested in sources of GPC, because GPC is designed as a front-end, and I search for the back-end...
In practice, it's not quite that simple. E.g., the procedure enter/exit instructions may depend on the formal parameters and, of course, on the optimization options, and inline routines don't have them at all.
Roughly, what you're looking for is in the machine description files under gcc/config/. However, it's probably not in the form you imagine -- it's written in a kind of Lisp, and it builds upon an internal representation (RTX), which is already quite different from the Pascal source.
For more detailed questions, you might want to ask the GCC list where the backend is maintained.
Frank
In general, the compilers that belong to the GNU Compiler Collection all follow the same architecture, which is:
- a language-specific front end
The front end is responsible for lexing, parsing, handling the symbol table etc. It may do some language-specific optimisations (like inlining, for example) and it will finally generate an intermediate representation in RTL (Register Transfer Language). RTL represents an idealized processor with an unlimited number of registers. A number of front ends first represent the program in tree form before actually generating the RTL code.
- a middle part
The middle part of the compiler is common to all languages and all processors. It does a number of optimizations on the RTL representation of the program and it runs the register allocator, which tries to map the idealized register representation to the real registers or stack slots.
- a processor-specific back end
The back end transforms the RTL representation into the actual processor-specific code and does a number of processor-specific peephole optimizations.
While these are functionally 3 different parts, they are actually linked together into one executable. There are no separate programs for each part of the compilation.
Given that the program goes through so many steps, there is no simple one-to-one mapping between the initial language constructs and the code generated. In particular, optimizations like inlining or certain aggressive optimizations in the middle end can make the produced code bear little resemblance to the original source code.
Marcel