In this post, I want to share some key concepts about the process known as ‘compilation’, specifically what happens when we compile a C source file.
What is compiling?
Compiling is the process of translating an entire program from one language into another, usually from a high-level, human-readable programming language into machine language (binary). As we are going to see, the program can also be translated into intermediate forms along the way.
At first glance, we might think that a simple program that prints a few characters, such as the typical ‘Hello, world!’ program, is something super basic. The truth is, there is a lot going on under the hood.
There are different types of compilers. I’m not going to go deeper into this topic, but if you want more information about compilers, go to this link.
One of the most popular C compilers is gcc (the GNU Compiler Collection). It can be found on most Unix-like operating systems, and there are compilers for Windows as well.
The above image shows a basic example: the C source file ‘main.c’ (by convention, files with the ‘.c’ extension contain C code), which holds the ‘Hello, world!’ program.
‘gcc main.c’ is the command used to compile it. It creates the file ‘a.out’, the default output name, which is an executable file we can run. When there is a problem, such as a syntax error or something missing, the compiler reports an error instead.
But let’s look more deeply at what is happening. The next image shows the steps of the compilation process with gcc:
The preprocessor analyzes the source code: it removes the comments, includes the header files, and replaces the macros with their defined values. Header files declare the functions provided by libraries, which are collections of precompiled functions used by the source code. Macros are names that are given a value with ‘#define’, either in a header or in the source file itself, and the preprocessor substitutes that value wherever the name appears.
Libraries can usually be found in the /usr/lib/ folder on Unix-like systems.
To stop after this step and save the preprocessed output, we use the command:
gcc -E main.c -o pre-main
For more options go to the gcc manual.
The next step is the compiler proper. It transforms the preprocessed C code into assembly code, a lower-level language that brings us closer to machine language. Assembly was widely used before high-level languages became common. The command for this step is gcc -S [file.c]:
gcc -S main.c
By default, this command creates a ‘.s’ file, in this case ‘main.s’.
The assembler translates the assembly code into what is known as object code, that is, machine instructions that the processor can execute. At this point the code is no longer readable by humans at first sight, but it is not yet a complete program: references to functions defined elsewhere, such as library functions, still have to be resolved.
The purpose of the linker is to ‘link’ the object code with the library functions it calls. When a program is split into modules, each C file is compiled into its own object file, and the linker combines those as well. Finally, the linker generates the executable file. By default, the file resulting from the compilation process with gcc is called a.out, as we saw in the first image (Compile main.c), but with the ‘-o’ option we can give it another name:
gcc [file_name] -o [defined_name_file]