Linux Compilers and Assemblers

SEP 19, 2003 By Christopher Paul, Rafeeq Rehman. Article is provided courtesy of Prentice Hall.

3.3 Compiling a Program

The GCC compiler is commonly invoked using the gcc command. The command accepts many command line switches that select different options for the compilation process. You can use the same switch more than once. For example, if you need to specify multiple include paths, you can use the -I option multiple times. However, you can't combine two switches into one. For example, -c and -o can't be combined as -co. This section describes different methods of compilation and the things you need to consider in the compilation process. Please note that in most software development projects, you don't invoke gcc directly from the command line. Instead, the GNU make utility reads one or more Makefiles, which specify how the compiler is invoked and which command line switches are used. Information about GNU make and Makefiles is presented in Chapter 5.

3.3.1 Simple Compilation

Consider the following C source code file, which is named hello.c. We shall frequently refer to this program in this chapter as well as in the coming chapters.

#include <stdio.h>
int main(void)
{
  printf("Hello world\n");
  return 0;
}

To compile this file, you just need to run the following command.

[rr@conformix 4]$ gcc hello.c
   [rr@conformix 4]$

By default, this command will generate an output file named a.out, which can be executed on the command line as follows:

[rr@conformix 4]$ ./a.out
   Hello world
   [rr@conformix 4]$

Note that regardless of the name of your source code file, the name of the output file is always a.out by default. Despite its name, this file is not in the traditional a.out executable format; it is an ELF file. If you want to create an output file with a different name, you have to use the -o command line option. The following command creates an output file named hello.

gcc hello.c -o hello

As you may have noted, the above commands do both the compiling and linking processes in a single step. If you don't want to link the program, then simple compilation of hello.c file can be done with the following command. The output file will be hello.o and will contain object code.

gcc -c hello.c

Note that the -c and -o command line switches can be used together. The following command compiles hello.c and produces an output file test.o, which is not yet linked.

gcc -c hello.c -o test.o

Usually when you compile many files in a project, you don't create a.out files. Either you compile the source files into object files and then link them together into an application, or you build executables that have the same name as the source code file.

3.3.2 Default File Types

GCC can recognize an input file by the last part of its name, sometimes called an extension. Table 3-1 shows file types and the extensions used with them. Depending upon a particular extension, gcc takes appropriate action to build the output file.

Table 3-1. File types used with GCC

File Extension   File Type
.c               C source code file.
.C               C++ source code file.
.cc              C++ source code file.
.cp              C++ source code file.
.cpp             C++ source code file.
.c++             C++ source code file.
.cxx             C++ source code file.
.m               Objective C source code file.
.F               Fortran source code file.
.fpp             Fortran source code file.
.FPP             Fortran source code file.
.h               C header file.
.i               C source code file. GCC does not preprocess it.
.ii              C++ source code file. GCC does not preprocess it.
.mi              Objective C source code file. GCC does not preprocess it.
.f               Fortran source code file. GCC does not preprocess it.
.for             Fortran source code file. GCC does not preprocess it.
.FOR             Fortran source code file. GCC does not preprocess it.
.s               Assembler code. GCC does not preprocess it.
.S               Assembler file.

This means that if you use the command gcc hello.c, GCC will consider hello.c a C program and will invoke the appropriate helper programs to build the output. However, if you use the gcc hello.cpp command, GCC will consider hello.cpp a C++ program and will compile it accordingly. You can also select the language for a particular file using the -x command line option. Table 3-2 lists languages that can be selected with this option.

Table 3-2. Selecting languages with -x option.

Option               Language
-x c (lowercase c)   C language selection
-x c++               C++ file
-x objective-c       Objective C
-x assembler         Assembler file
-x f77               Fortran file
-x java              Java language file

Note that the -x option applies to all the files that follow it until you turn it off using the -x none option on the command line. This is especially important when you use the GNU make utility discussed in Chapter 5.

By default, the object file created by GCC has the .o extension, which replaces the original extension of the source file. You can create an output file with a particular name using the -o command line option. The following command creates the test.o file from hello.c.

gcc -c hello.c -o test.o

3.3.3 Compiling to Intermediate Levels

The compilation process involves several steps: preprocessing, compiling, assembling and linking. By default, GCC carries out all of these steps and generates executable code, as you have seen earlier. However, you can force GCC to stop before the last step. For example, with the -c command line option, the gcc command compiles and assembles a source code file but does not link it, so no executable is generated. As you already know, the following command compiles the file hello.c and creates an object file hello.o.

gcc -c hello.c

If you look at the type of the newly created object file using the file command, you can see that this is not a linked file. This is shown in the following command output.

[root@conformix chap-03]# file hello.o
   hello.o: ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped
   [root@conformix chap-03]#

Creating Assembler Code

Using the -S (uppercase S) command line option, you can stop the GCC compiler just before the assembler step. The output is an assembler file with a .s extension. The following command creates an output file hello.s from the source code file hello.c.

gcc -S hello.c

If you look at the output file, you can see the assembler code. The contents of the input file are as follows:

#include <stdio.h>
int main(void)
{
  printf ("Hello world\n");
  return 0;
}

The output assembler code is shown below:

[root@conformix chap-03]# cat hello.s
   .file   "hello.c"
   .version        "01.01"
   .section        .rodata
   .string "Hello world\n"
   .align 4
   .globl main
   .type    main,@function
   pushl   %ebp
   movl    %esp, %ebp
   subl    $8, %esp
   subl    $12, %esp
   pushl   $.LC0
   call    printf
   addl    $16, %esp
   .size    main,.Lfe1-main
   .ident  "GCC: (GNU) 2.96 20000731 (Red Hat Linux 7.1 2.96-81)"
   [root@conformix chap-03]#

This assembler code can be passed to an assembler such as GNU as. Here “as” is not the English word but the name of the GNU Assembler, which is written as GNU as from here on. The assembler file can also be assembled and linked to create an executable. Please note that the compiler included in the Red Hat distribution was used for the above command.

3.3.4 Compilation with Debug Support

If you want to debug a program after compiling it with gcc, you have to include debug information in the compiled program. Debug information is included in the object file by using the -g command line switch with gcc. The following command creates the hello.o file that contains debug information.

[rr@conformix 4]$ gcc -g -c hello.c
   [rr@conformix 4]$

Note that a file compiled with debug information may be considerably larger than the same file compiled without it. For the example program hello.c, the size of hello.o is 908 bytes when compiled without debug information and 10780 bytes when compiled with debug information.

You can use multiple debug levels with the -g option. The default debug level is 2, which is equivalent to using the -g2 command line option. If you use the -g3 command line option, information about macros is also included, which makes it easier to debug macros.

You can use the debug option with optimization options. Optimization options are discussed later in this chapter.

With the -a option on the command line, you can also include some profiling information in the object code.

You can also use some command line switches to provide extra information. For example, one useful thing is to print out a list of directories that the gcc command searches to find files. The following command will print all directories that gcc uses to search libraries, programs and so on.

[rr@conformix 4]$ gcc -print-search-dirs hello.c -o hello
   install: /usr/lib/gcc-lib/i386-redhat-linux/2.96/
   programs: =/usr/lib/gcc-lib/i386-redhat-linux/2.96/:/usr/lib/gcc-lib/i386-redhat-linux/2.
   libraries: =/usr/lib/gcc-lib/i386-redhat-linux/2.96/:/usr/lib/gcc/i386-redhat-linux/2.96/:
   [rr@conformix 4]$

Here I have used GCC version 2.96 that came with RedHat Linux 7.1 and you can see directories and references to this version information also.

You can also find the amount of time taken by each process during compilation. The following command displays time taken during each step of building the output file.

[rr@conformix 4]$ gcc -time hello.c -o hello
   # cpp0 0.06 0.00
   # cc1 0.08 0.01
   # as 0.02 0.00
   # collect2 0.12 0.03
   [rr@conformix 4]$

It is also evident from the output of the above command that GCC has used four other programs (cpp0, cc1, as and collect2) during the compilation process.

3.3.5 Compilation with Optimization

By default, the first objective of the compiler is to generate output code quickly, so it does not perform any code optimization; this keeps the compile time short. However, you can instruct gcc to optimize the code it generates. This is done using -O (uppercase O, not zero) on the command line. Different optimization levels can be selected by adding a number suffix to this option. For example, -O2 optimizes code at level 2. If you specifically don't want any code optimization, you can use zero with the option, as in -O0.

So what does optimization mean? Consider the following C source code file sum.c that calculates the sum of two numbers and prints the result. Of course this is not the best code for this purpose and it is used only to demonstrate a point.

 1  #include <stdio.h>
 2  int main(void)
 3  {
 4    int a, b, sum;
 5
 6    a=4;
 7    b=3;
 8    sum = a+b;
 9
10    printf("The sum is: %d\n", sum);
11  }

If you compile this program without any optimization, the compiler will generate code for all lines starting from line number 6 to line number 10. This can be verified by loading the file in a debugger and tracing through it. However, if you optimize the compilation process, lines 6 to 10 can be replaced by a single line as shown below. This can be done without affecting the output of the program.

printf("The sum is: 7\n");

This is because the compiler can easily determine that the values of all the variables are known constants, so there is no need to assign the values and calculate the sum at run time. All of this can be done at compile time. You can also verify this fact in a debugger. The optimized code skips over the assignment lines (lines 6 to 8) and jumps directly to the printf statement when you step through it.

However in the following code, the compiler can't make such decisions because the numbers a and b are entered interactively.

 1  #include <stdio.h>
 2  int main(void)
 3  {
 4    int a, b, sum;
 5
 6    printf("Enter first number: ");
 7    scanf("%d", &a);
 8    printf("Enter second number: ");
 9    scanf("%d", &b);
10
11    sum = a+b;
12
13    printf("The sum is: %d\n", sum);
14  }

If you compile this code with different levels of optimization (e.g., -O1 and -O2) and then trace it through a debugger, you will see a difference in the execution sequence because of the decisions the compiler makes at compile time.

Optimization is not always beneficial. For example, code optimization changes the timing, or number of clock cycles, of the code when it is executed. This may create problems, especially on embedded systems, if you have debugged your code by compiling it without optimization. The rule of thumb is that you should write efficient code yourself instead of relying on the compiler to optimize it for you.

For a detailed list of optimization options, please see the options that start with -f. However, the options starting with -O are the most commonly used in the optimization process.

3.3.6 Static and Dynamic Linking

A compiler can generate static or dynamic code depending upon how you proceed with the linking process. If you create static object code, the output files are larger but they can be used as stand-alone binaries. This means that you can copy an executable file to another system and it does not depend on shared libraries when it is executed. On the other hand, if you choose dynamic linking, the final executable code is much smaller but it depends heavily upon shared libraries. If you copy the final executable program to another system, you have to make sure that the shared libraries are also present on the system where your application is executed. Please note that version inconsistencies in dynamic libraries can also cause problems.

To create static binaries, you have to use the -static command line option with gcc. Dynamically linked binaries need no special option on most Linux systems because dynamic linking is the default. The -shared option is related but serves a different purpose: it builds a shared library rather than an executable program.

For example, if we compile the hello.c program used earlier in this chapter with shared libraries, the size of the output executable file is 13644 bytes (this can be further reduced using the strip utility discussed later in Chapter 7 of this book). However, if you compile it statically, the size of the output binary file is 1625261 bytes, which is very large compared to the shared binary. Note that this size can also be reduced using the strip utility.

To identify the dependencies of a dynamically linked binary file, you can use the ldd command. The following command shows that linked output file hello depends upon two dynamic libraries.

[rr@conformix 4]$ ldd hello
   libc.so.6 => /lib/i686/libc.so.6 (0x4002c000)
   /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
   [rr@conformix 4]$

If you copy hello to some other host, you also need to make sure that libc.so.6 and ld-linux.so.2 exist on the target system.

On most of the Linux systems, dynamic linking is done by default.

3.3.7 Compiling Source Code for Other Languages

As mentioned earlier, the GCC set of compilers supports many languages, so it can be used to compile programs written in languages other than C. Following is an introduction to compiling programs in some of these languages.

Compiling C++ Code

C++ source code files have suffixes such as .C, .cpp, .cc, .c++, .cxx or .cp. The gcc compiler recognizes these extensions and can compile C++ code. However, you can also use the g++ or c++ commands, which are part of the GCC family of compilers and are installed with it. These programs invoke gcc with the appropriate options to compile C++ code and to locate class files. Using these programs, you can also compile C++ source code files that don't have the standard suffixes listed earlier.

Compiling Objective C Code

Objective C files have suffixes such as .m, and gcc recognizes Objective C files by that suffix. When you compile Objective C code, you have to pass an option to the linker. This is done using -lobjc. With this option, the linker uses the Objective C libraries during the linking process. Consider the following sample Objective C code (stored in the hello.m file), which prints “Hello World” on the standard output.

#include <stdio.h>
#include "objc/Object.h"

@interface HelloWorld : Object
{
  STR msg;
}

+ new;
- print;
- setMessage: (STR) str;

@end

@implementation HelloWorld

+ new
{
  self = [super new];
  [self setMessage : ""];
  return self;
}

- print
{
  printf("%s\n", msg);
  return self;
}

- setMessage: (STR) str
{
  msg = str;
  return self;
}

@end

int main(int argc, char**argv) {
  id msg;

  msg = [HelloWorld new];

  [msg setMessage: "Hello World"] ;
  [msg print];
  return 0;
}

You can compile and link it using the gcc hello.m -lobjc command. The output is again an a.out file that can be executed on the command line.

This is a rather long “Hello World” program. There are much shorter Objective C “Hello World” programs available on the Internet.

Compiling Java Code

Information about the GCC Java compiler gcj is available at http://gcc.gnu.org/java/. Before you can build Java programs, you also need to have libgcj installed. With old compilers, you had to install it separately from source code. When you build and install new versions of GCC, libgcj is installed automatically. If you are still using an old compiler and want to have libgcj installed, information is available at http://gcc.gnu.org/java/libgcj2.html. Briefly the process is as follows:

Download libgcj from ftp://sourceware.cygnus.com/pub/java/ or another web site on the Internet. Untar it in /opt directory and a new directory will be created under /opt which will contain source code for libgcj. Create a directory /opt/libgcj-build and move into this directory. After that you have to perform the following sequence of commands:

  • ../libgcj/configure

  • make

  • make install

Note that your new compiler must be in PATH before you build libgcj.

Now let us see how to compile a Java program. Consider the following simple Java program that prints the message “Hello World”. The source code filename is hello.java.

class HelloWorld {
  public static void main (String args[]) {
    System.out.print("Hello World ");
  }
}

Traditionally, you invoke the javac program to build byte code, and then run that byte code using the java program on Linux. However, if you use gcj, you can create a binary output file hello using the following command:

gcj --main=HelloWorld -o hello hello.java

The output file is hello, which is an executable binary. The --main switch names the class whose main method is used as the entry point when the program is executed.

The compiler uses some information to build Java code. This information includes reading the gcj specification file and libraries. The following command displays this information.

[rr@conformix 4]$ gcj -v
   Reading specs from /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/specs
   Reading specs from /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/../../../libgcj.spec
   rename spec lib to liborig
   rename spec startfile to startfileorig
   Configured with: ../gcc-3.0.4/configure --prefix=/opt/gcc-3.0.4 --enable-threads=posix
   Thread model: posix
   gcc version 3.0.4
   [rr@conformix 4]$

The compilation of Java programs is completed in many steps. Let us compile the hello.java program to build a statically linked hello output binary using the following command. The -v switch shows all of the steps during this process.

[rr@conformix 4]$ gcj hello.java --main=HelloWorld  -o hello -static -v
   Reading specs from /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/specs
   Reading specs from /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/../../../libgcj.spec
   rename spec lib to liborig
   rename spec startfile to startfileorig
   Configured with: ../gcc-3.0.4/configure --prefix=/opt/gcc-3.0.4 --enable-threads=posix
   Thread model: posix
   gcc version 3.0.4
   /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/jc1 hello.java 
   -fuse-divide-subroutine -fuse-boehm-gc -fnon-call-exceptions -quiet -dumpbase hello.java 
   -g1 -version -o /tmp/ccHj5WMY.s
   GNU Java version 3.0.4 (i686-pc-linux-gnu)
   compiled by GNU C version 3.0.4.
   as --traditional-format -V -Qy -o /tmp/cchm92Nc.o /tmp/ccHj5WMY.s
   GNU assembler version 2.10.91 (i386-redhat-linux) using BFD version
   /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/jvgenmain HelloWorldmain /tmp/
   /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/cc1 /tmp/ccTlFcXz.i -quiet -dumpbase 
   HelloWorldmain.c -g1 -version -fdollars-in-identifiers -o /tmp/ccHj5WMY.s
   GNU CPP version 3.0.4 (cpplib) (i386 Linux/ELF)
   GNU C version 3.0.4 (i686-pc-linux-gnu)
   compiled by GNU C version 3.0.4.
   as --traditional-format -V -Qy -o /tmp/ccBgJjpa.o /tmp/ccHj5WMY.s
   GNU assembler version 2.10.91 (i386-redhat-linux) using BFD version
   /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/collect2 -m elf_i386 -static -o hello 
   /usr/lib/crt1.o /usr/lib/crti.o /opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4/
   crtbegin.o -L/opt/gcc-3.0.4/lib/gcc-lib/i686-pc-linux-gnu/3.0.4 -L/opt/gcc-3.0.4/lib/
   gcc-lib/i686-pc-linux-gnu/3.0.4/../../.. /tmp/ccBgJjpa.o /tmp/cchm92Nc.o -lgcc -lgcj -lm 
   -lgcjgc -lpthread -lzgcj -ldl -lgcc -lc -lgcc /opt/gcc-3.0.4/lib/gcc-lib/
   i686-pc-linux-gnu/3.0.4/crtend.o /usr/lib/crtn.o
   [rr@conformix 4]$

As you can see, different programs have been executed to get the output binary file. These programs include:

  • The jc1 program

  • The GNU assembler as. Again, as is the name of the assembler.

  • The jvgenmain program

  • The cc1 compiler

  • The collect2 program

You can also see various command line switches used with these programs.

3.3.8 Summary of gcc Options

Hundreds of options can be used with gcc on the command line. An explanation of all of these options is beyond the scope of this book. However, following is a summary list of these options as displayed by the gcc man page (using the man gcc command). The options are grouped in sections, which is helpful if you are looking for options related to a particular task.

Overall Options

-o file
-x language
--help

C Language Options

-aux-info filename
-fshort-wchar

C++ Language Options

-Wsynth

Objective-C Language Options

-Wselector

Language Independent Options

-fdiagnostics-show-location=[once|every-line]

Warning Options

-Wwrite-strings

C-only Warning Options

-Wtraditional

Debugging Options

-ftime-report -g
-time

Optimization Options

-fdce -fdelayed-branch
-funroll-loops --param name=value -O
-Os

Preprocessor Options

-idirafter dir
-include file
-imacros file
-iprefix file
-iwithprefix dir
-iwithprefixbefore dir
-isystem dir
-Wp,option

Assembler Option

-Wa,option

Linker Options

-Xlinker option
-u symbol

Directory Options

-specs=file

Target Options

-b machine
-V version

Machine Dependent Options

M680x0 Options


M68hc1x Options


VAX Options


SPARC Options



Convex Options


AMD29K Options



ARM Options


MN10200 Options


MN10300 Options


M32R/D Options

-msdata=sdata-type -G num

M88K Options



RS/6000 and PowerPC Options





-mvxworks -G num

RT Options


MIPS Options


-G num
-mabi=eabi -mfix7000

i386 Options



HPPA Options


Intel 960 Options


DEC Alpha Options



Clipper Options


H8/300 Options


SH Options


System V Options


ARC Options





TMS320C3x/C4x Options

-mdp-isr-reload -mrpts=count

V850 Options



NS32K Options


AVR Options



MCore Options


IA-64 Options



S/390 and zSeries Options


Xtensa Options

-mno-long-calls

Code Generation Options




© 2003 Pearson Education, Inc. InformIT Division. All rights reserved.
800 East 96th Street Indianapolis, Indiana 46240