A story with a sequel: a homemade Pascal compiler for Windows from scratch

The unexpectedly warm welcome that the Habr community gave my post about the homemade XD Pascal compiler for MS-DOS made me think. Is it not a pity that an amateur project into which I put so much energy has been dead weight ever since the DOS virtual machine disappeared from Windows? The result of these reflections was the XD Pascal compiler for Windows. It may have lost some of its nostalgic charm, along with the possibility of naive graphics programming through BIOS interrupts. However, the move to Windows breathed new life into the project and opened the way to a long-standing dream: self-compilation.

As before, I did not use any auxiliary tools for automatic compiler generation. Such stubbornness may look strange, but the project had a single goal, my own pleasure, and additional tools would only have been an obstacle. In this sense, the compiler was developed from scratch.



Five steps to self-compiling on Windows


It is worth saying a few words about the main tasks that had to be solved on the way from DOS to Windows:

Generating the headers and sections of the executable file. In addition to the official description of the Portable Executable format, the article "Creating the smallest possible PE executable" was an excellent help at this stage. Since the headers and sections require the exact addresses of procedures and variables, and these become known only after the sizes of the code and global data have been calculated, compilation had to be done in three passes. In the first pass, a graph of procedure calls is built and “dead” procedures are marked; in the second, the addresses and the code and data sizes are calculated and the headers are filled in; in the third, the code is generated. Such a trick is rather inelegant, especially considering that every pass repeats all stages of compilation anew, starting with lexical analysis. However, it leads to very concise compiler source code and does not require any intermediate representation of the program. Addition: relocatable code generation has since been implemented, and compilation is now single-pass.
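The three-pass scheme can be illustrated with a minimal Python sketch. It is not the XD Pascal implementation: the real compiler re-parses the source on every pass, whereas here the passes run over a prebuilt table. The procedure names, body sizes and CODE_BASE value are made up for illustration.

```python
# Toy "program": each procedure has a body size in bytes and a list of callees.
# All names and sizes are hypothetical.
PROGRAM = {
    "Main":   {"size": 10, "calls": ["Print"]},
    "Print":  {"size": 6,  "calls": []},
    "Unused": {"size": 20, "calls": ["Print"]},
}
CODE_BASE = 0x1000  # assumed start address of the code section


def pass1_mark_live(program, entry="Main"):
    """Pass 1: walk the call graph from the entry point; unreached procedures are dead."""
    live, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name not in live:
            live.add(name)
            stack.extend(program[name]["calls"])
    return live


def pass2_assign_addresses(program, live):
    """Pass 2: lay out live procedures only and compute the total code size."""
    addresses, offset = {}, 0
    for name, proc in program.items():
        if name in live:
            addresses[name] = CODE_BASE + offset
            offset += proc["size"]
    return addresses, offset


def pass3_emit(program, live, addresses):
    """Pass 3: 'generate code'; every call target is already known from pass 2."""
    listing = []
    for name, proc in program.items():
        if name in live:
            targets = [addresses[callee] for callee in proc["calls"]]
            listing.append((addresses[name], name, targets))
    return listing


live = pass1_mark_live(PROGRAM)
addresses, code_size = pass2_assign_addresses(PROGRAM, live)
listing = pass3_emit(PROGRAM, live, addresses)
```

The point of the split is that pass 3 never needs to patch anything: by the time code is emitted, the final address of every surviving procedure is fixed.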

New code generator. Compiling for Windows required replacing segment:offset register pairs with 32-bit offset registers, as well as removing (and in places adding) the operand-size (66h) and address-size (67h) prefixes.
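To show why the 66h prefix moves around when the target changes from 16-bit to 32-bit code, here is a small Python sketch that encodes MOV AX/EAX, imm (opcode B8h). The function name and its parameters are assumptions made for this example and do not come from the XD Pascal sources.

```python
import struct

OPERAND_SIZE_PREFIX = 0x66  # toggles between 16- and 32-bit operand size


def mov_acc_imm(value, operand_bits, mode_bits):
    """Encode MOV AX/EAX, imm for a code segment whose default size is mode_bits."""
    if operand_bits not in (16, 32) or mode_bits not in (16, 32):
        raise ValueError("only 16- and 32-bit sizes are handled in this sketch")
    code = bytearray()
    # The prefix is needed only when the operand size differs from the default.
    if operand_bits != mode_bits:
        code.append(OPERAND_SIZE_PREFIX)
    code.append(0xB8)  # MOV accumulator, immediate
    code += struct.pack("<H" if operand_bits == 16 else "<I", value)
    return bytes(code)
```

For example, mov_acc_imm(0x12345678, 32, 16) carries the prefix because a 32-bit operand is the exception in 16-bit code, while mov_acc_imm(0x12345678, 32, 32) does not.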

Directive for declaring external Windows API functions. All function names declared with the external directive are entered into the tables of the import section of the executable file. Since these functions require arguments to be passed from right to left, the order of the arguments had to be reversed manually in the declarations and calls of all such functions. This removed the need for the compiler to perform the reversal. For simplicity, all arguments of procedures and functions in XD Pascal are passed as 32-bit values; fortunately, this rule also holds for Windows API functions, so interaction with the system libraries did not complicate the argument-passing mechanism. Addition: the argument order of imported functions is now reversed automatically.
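The right-to-left rule can be sketched in Python as a code generator that pushes constant 32-bit arguments with PUSH imm32 (opcode 68h), last argument first. This is only an illustration of the calling convention; the helper name is hypothetical, and a real compiler would push evaluated expressions rather than constants.

```python
import struct


def push_args_right_to_left(args):
    """Emit PUSH imm32 for each argument, starting with the last one."""
    code = bytearray()
    for value in reversed(args):
        code.append(0x68)  # PUSH imm32
        code += struct.pack("<I", value & 0xFFFFFFFF)  # every argument is 32 bits
    return bytes(code)
```

With this order, the first declared argument ends up on top of the stack when the call is made, which is what the imported function expects.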

Removing sets and infix string operations from the compiler's source code. This requirement stems from the task of self-compilation. Expression evaluation in XD Pascal is designed so that all intermediate results are 32 bits long and are stored on the stack. For Pascal strings and sets, this approach is not acceptable. More precisely, it would have allowed sets of up to 32 elements, but such sets would have been practically useless.
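The 32-element limit follows from storing a set as a bit mask in a single 32-bit stack slot. The Python sketch below models that hypothetical representation (XD Pascal itself does not implement it); the function names are invented for the example.

```python
WORD_BITS = 32
WORD_MASK = 0xFFFFFFFF


def set_of(*elements):
    """Build a set as a bit mask that fits into one 32-bit stack slot."""
    mask = 0
    for e in elements:
        if not 0 <= e < WORD_BITS:
            raise ValueError(f"element {e} does not fit into a {WORD_BITS}-bit set")
        mask |= 1 << e
    return mask


def union(a, b):
    """Set union is a single bitwise OR on the two stack values."""
    return (a | b) & WORD_MASK


def contains(s, e):
    """Membership test: check whether bit e is set."""
    return 0 <= e < WORD_BITS and (s >> e) & 1 == 1
```

A typical Pascal use such as a set of characters already fails here: set_of(ord('A')) raises an error because ordinal 65 needs more than 32 bits.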

Wrappers for some procedures. The idea of self-compilation led to wrapping calls to some standard library routines. The wrapper signature is the same whether the compiler is built by an external compiler (Delphi / Free Pascal) or compiles itself; only the wrapped procedures differ. Thus, everything specific to the compilation method is localized within a few wrappers. Pascal is replete with procedures that, on closer examination, cannot be implemented according to the rules of Pascal itself: Read, Write, Move, etc. For the most common procedures, including Read and Write, I made an exception and implemented them in a way that is atypical for the grammar of the language but familiar to any connoisseur of Pascal. For most other non-standard procedures, wrappers were needed. Thus, XD Pascal is not fully compatible with Delphi or Free Pascal, but this is not a big deal, since even Free Pascal in its Delphi compatibility mode actually remains incompatible. Addition: untyped formal variable arguments are now supported. This made it possible to make the procedures BlockRead, BlockWrite, Move and FillChar compatible with Delphi and Free Pascal, thereby radically reducing the number of required wrappers.

Compiling Programs with a GUI


The task of self-compilation, despite its symbolic meaning, remains limited: the compiler is a console program and therefore does not look like a full-fledged inhabitant of the Windows world. A few more innovations were needed on the way to compiling programs with a window interface:

Compiler directive for setting the interface type. The interface type (console or graphical) must be specified in a separate header field of the executable file. As is well known, Delphi and Free Pascal have the $APPTYPE directive for this. A similar $A directive has appeared in XD Pascal.
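The header field in question is the Subsystem field of the PE optional header, where 2 means a GUI program and 3 a console program. The following Python sketch patches that field in an in-memory image. The helper names are hypothetical, and how XD Pascal maps its $A directive onto the field is not shown in the article, so that step is left out.

```python
import struct

IMAGE_SUBSYSTEM_WINDOWS_GUI = 2  # graphical program
IMAGE_SUBSYSTEM_WINDOWS_CUI = 3  # console program

E_LFANEW_OFFSET = 0x3C  # DOS header field holding the offset of the PE signature
# PE signature (4 bytes) + COFF file header (20 bytes) + offset inside optional header (68)
SUBSYSTEM_OFFSET_FROM_PE = 4 + 20 + 68


def _pe_offset(image):
    """Locate the PE signature and check that it is present."""
    (pe_offset,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if bytes(image[pe_offset:pe_offset + 4]) != b"PE\0\0":
        raise ValueError("PE signature not found")
    return pe_offset


def get_subsystem(image):
    """Read the 16-bit Subsystem field."""
    (value,) = struct.unpack_from("<H", image, _pe_offset(image) + SUBSYSTEM_OFFSET_FROM_PE)
    return value


def set_subsystem(image, subsystem):
    """Overwrite the 16-bit Subsystem field in a mutable image (bytearray)."""
    struct.pack_into("<H", image, _pe_offset(image) + SUBSYSTEM_OFFSET_FROM_PE, subsystem)
```

A compiler that builds the headers itself would simply write the chosen constant while filling in the optional header; patching an existing image, as done here, is just the shortest way to show where the value lives.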

The operation of taking the address of procedures and functions. Classical Pascal has no full-fledged pointers to procedures and functions; they are replaced to some extent by procedural types. This type is not implemented in XD Pascal. Be that as it may, applying the @ operation to procedures had always seemed useless to me in my modest project. However, Windows API event handling is based on callbacks, and here passing the address of the handler procedure suddenly became an urgent need.

Explicitly specifying the names of the linked libraries. For console programs, importing Windows API functions from KERNEL32.DLL was enough. Programs with a GUI also pull in USER32.DLL, GDI32.DLL, etc. The syntax of the external directive had to be extended with the library name.


GUI Demo

What came out of it


The result is a very simple self-compiling compiler for Windows. It would hardly be fair to compare it with powerful collective projects such as Free Pascal. Rather, it falls into the weight category of the well-known amateur project BeRo Tiny Pascal. Compared to it, XD Pascal has noticeable advantages: Pascal's grammar is observed more strictly and errors are checked, there is full-fledged file input/output, floating-point arithmetic is supported, there is no dependence on an external assembler, and programs with a window interface can be compiled.

Next, I will have to deal with false positives from some antivirus programs, a new problem that I never thought about in the small cozy world of MS-DOS. With luck, XD Pascal will be introduced, along with BeRo Tiny Pascal, into the lab practicum of the compiler construction course at Bauman Moscow State Technical University (MSTU).

Source: https://habr.com/ru/post/462889/

