At the university I’m attending, the Hellenic Open University, they developed, years ago, a custom programming language and a custom compiler for it. The language is OK: it feels a bit like Pascal, and the commands are all in Greek. For example:

ΑΛΓΟΡΙΘΜΟΣ ΔΟΚΙΜΗ
	ΔΕΔΟΜΕΝΑ x:INTEGER;
ΑΡΧΗ
	ΤΥΠΩΣΕ ("PLEASE ENTER A NUMBER: ", EOLN);
	ΔΙΑΒΑΣΕ (x);
	ΤΥΠΩΣΕ ("YOU GAVE: ", x, EOLN)
ΤΕΛΟΣ

The compiler, on the other hand… is one of the worst compilers I’ve ever used. It’s very buggy, and it also has millions of encoding problems, which originate from the fact that it expects files to be in Windows-1253 encoding while most editors these days use UTF-8. It’s a compiler you can use, but it doesn’t fit its purpose. So it’s bad. What I mean by this is that there is a way to use this compiler: use Notepad++ and constantly check that your encoding is Windows-1253, and work around the compiler’s bugs by writing your code with the same logic but a bit differently. All this makes coding in that language with this compiler harder than coding in C, which is taught later and is considered harder… oh sweet summer child, if you only knew (!!! 😂 !!!). But the purpose of this compiler was to be introductory and easy, so it has failed its purpose for good. And it’s not only me who would say this. Aristotle argued that something is good when it fulfills its “τέλος” (purpose or function), so there’s a good chance Aristotle would consider this compiler “bad”.
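
To make the encoding mismatch concrete, here is a tiny Python sketch of what happens when UTF-8 bytes are read as Windows-1253. It only illustrates the mismatch itself; how the compiler actually reads its input is not something I have verified, so treat that part as an assumption.

```python
# A keyword from the example program above.
source = "ΑΡΧΗ"

# What a modern editor saves by default: each Greek letter takes 2 bytes in UTF-8.
utf8_bytes = source.encode("utf-8")

# What a reader that assumes Windows-1253 (Python codec name: cp1253) makes of
# those bytes: every byte becomes one character, so the keyword turns into garbage.
# errors="replace" guards against bytes that cp1253 does not define.
misread = utf8_bytes.decode("cp1253", errors="replace")

print(len(source), len(utf8_bytes), len(misread))
print(misread)

# The same keyword saved as Windows-1253 is one byte per letter.
print(source.encode("cp1253"))
```

The 4-letter keyword becomes 8 unrelated characters, and a parser looking for ΑΡΧΗ will not find it.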

As the cherry on top, this compiler doesn’t work on Linux. So it’s trash.

It was clear that something needed to be done with this compiler.

I managed to get this compiler to work on Linux after a lot of tweaking with the “wine” compatibility tool, and because I didn’t want anyone else to go through all this, I made a public repository [1] with a ready solution. You can find everything here: https://github.com/rept0id/hou-compiler-linux/.

This “wine” solution makes using this compiler on Linux not only possible but actually better than it is on the other platforms. This is because, before it runs the compiler and the produced executable through wine, it also re-encodes the provided file from UTF-8 to Windows-1253. This way, you write with a sane person’s tools, you know, the average editor of the simple layman out there, which uses UTF-8, and then another file with the same content but in Windows-1253 gets created and handed to the compiler. But you don’t care about the other file, as it’s just an intermediate step. You write UTF-8.
Now you can focus on your code instead of the encodings.
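
The re-encoding step can be sketched in a few lines of Python. This is not the script from the repository, just a minimal illustration of the idea; the function name and the file names in the usage note are made up.

```python
from pathlib import Path


def to_windows_1253(src: Path, dst: Path) -> None:
    """Write a Windows-1253 copy of a UTF-8 source file (hypothetical helper)."""
    # "utf-8-sig" also accepts files that start with a UTF-8 byte order mark
    # and strips it, so the mark does not end up in the converted file.
    text = src.read_bytes().decode("utf-8-sig")

    # Strict encoding on purpose: a character that Windows-1253 cannot
    # represent (an emoji, for example) raises UnicodeEncodeError here
    # instead of being silently mangled.
    dst.write_bytes(text.encode("cp1253"))
```

Usage would look like to_windows_1253(Path("program.utf8.txt"), Path("program.cp1253.txt")), with the second file being the one handed to the compiler. Working on bytes rather than text mode keeps the line endings exactly as they were.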

This solution is “good” (by Aristotle) but it’s not “perfect” (by me, even though Aristotle’s opinion matters more). Because of some bug involving this compiler and wine, if you try to print Greek characters from your program they are displayed wrong (as a workaround, use Greek written with Latin characters, i.e. Greeklish, or just English). Such a thing is very annoying for a compiler, and a whole concept, meant to work with Greek. This actually still makes the program “not good” by Aristotle. This must be fixed as well, or else the solution is not perfect. I believe there are 2 ways to fix this: one is to keep fuck-around-find-out with Wine, and the second is to reverse engineer this compiler totally and make a better one. Guess which one I find more fun!

The first step into reverse engineering was to throw this compiler into a classic .NET reverse engineering tool, like .NET Reflector [6] or the newer dotPeek [7], because those were the only ones I knew of, from a friend. Actually, .NET is the best case for reverse engineering a program, because the code that you write in C#, VB, F# or whatever gets compiled into bytecode for the “Common Language Runtime”, which is considered easier to reverse engineer than raw machine code. Those tools, if the code is not “obfuscated” against them, will give you a very good result. This bytecode thing makes me put all those languages in the same basket as Java; even though they don’t claim that they run inside a VM, they aaalmost do something that could be labeled as similar. Turns out this compiler is not a .NET program, and I couldn’t get anything out of those tools.
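
In hindsight, there is a cheaper way to ask “is this .NET?” than loading the file into a decompiler: a .NET assembly is a normal Windows PE file whose CLR runtime header entry (data directory number 14) is filled in. Below is a minimal Python sketch of that check. The function name is made up, it only looks at the headers, and it is not something I ran as part of the story above.

```python
import struct


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE image in `data` has a non-empty CLR data directory."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False  # not a DOS/PE executable at all

    # Offset 0x3C of the DOS header points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return False

    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 4 + 20
    if len(data) < opt + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+ (64-bit)
        dirs = opt + 112
    else:
        return False

    # NumberOfRvaAndSizes sits right before the data directory table.
    if len(data) < dirs:
        return False
    (count,) = struct.unpack_from("<I", data, dirs - 4)
    if count <= 14:
        return False

    # Entry 14 is the CLR runtime header: 4-byte RVA followed by 4-byte size.
    entry = dirs + 14 * 8
    if len(data) < entry + 8:
        return False
    rva, size = struct.unpack_from("<II", data, entry)
    return rva != 0 and size != 0
```

Calling has_clr_header(open("pli10.exe", "rb").read()) would answer the question in one line; a native program built with a C/C++ toolchain leaves that entry empty.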

The next step was to make a better guess at what this program was made with. I ran strings ./pli10.exe in a Linux terminal and skimmed the result. I saw many occurrences of “GCC” (strings ./pli10.exe | grep "GCC" to filter them) and concluded that this may be a C/C++ program, since GCC compiles C and C++. I can’t be 100% sure, as this compiler uses tdm-gcc to generate the machine code, so maybe it’s written in something other than C/C++ and just calls the GCC compiler. So I had to do what any programmer who likes math would have done and recite the poem: “let’s assume this program is C/C++”, and continue based on this.
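
If you don’t have strings and grep at hand, roughly the same thing can be sketched in Python: pull out runs of printable ASCII characters and keep the ones that contain a keyword. The function names and the sample bytes are made up for illustration, and the real strings tool has more options than this.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def grep(lines: list[str], needle: str) -> list[str]:
    """Keep only the lines that contain `needle`."""
    return [line for line in lines if needle in line]


# Made-up bytes standing in for an executable; for the real thing you would
# read the file instead: data = open("pli10.exe", "rb").read()
sample = b"\x00\x01GCC: (example)\x00ab\x00hello world\xff"
print(grep(printable_strings(sample), "GCC"))  # prints ['GCC: (example)']
```

The 4-character minimum is the same default that strings uses, which is why the short run “ab” is dropped.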

I didn’t know any good C/C++ disassembler, so ChatGPT and Google were my friends, and I found that there are actually 2 very good ones out there:

  • Ghidra [2][3], made by the NSA
  • Interactive Disassembler (IDA) [4], made by Ilfak Guilfanov [5]

I managed to get Ghidra and decompile the compiler back to code. But I wasn’t able to find many things. Not yet.

By the way, as far as I’ve read, Ghidra can also handle programs written in other languages, like Go and Rust.

Still, the journey doesn’t end here, as I’ve already found one very interesting line:

system("chcp 1253 > nul");

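On Windows, chcp 1253 switches the console to code page 1253, so the program forces this encoding on its own output. Maybe this program could work much better if this legacy Windows-1253 encoding wasn’t enforced.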

Links

  1. https://github.com/rept0id/hou-compiler-linux
  2. https://en.wikipedia.org/wiki/Ghidra
  3. https://github.com/NationalSecurityAgency/ghidra
  4. https://en.wikipedia.org/wiki/Interactive_Disassembler
  5. https://en.wikipedia.org/wiki/Ilfak_Guilfanov
  6. https://en.wikipedia.org/wiki/.NET_Reflector
  7. https://www.jetbrains.com/decompiler/

Last modified: 14 March 2025
