07 Nov

What exactly does GCC optimize?

All compilers optimize code to some extent, some better than others. However, at least to me a great amount of the nature of the optimizations is unclear, outside the explicitly stated loop unrolling and like. Let’s find out which kind of programmer laziness and/or stupidity gets eliminated.

Setup

I used GCC 4.2.2 (MinGW) to compile the test program. The following options produce an assembly source of the compiled binary juxtaposed to the C source:

gcc -c -Wa,-a,-ad opt.c > opt.lst

The tests mostly check if the compiler realizes a function always returns a certain value (opt1(), opt2()), if a function call can be completely eliminated by looking at the input parameter (opt5()) and so on. While compilers do a good job at saving CPU cycles by using registers to pass parameters instead of the stack and so on, the question here is how they manage with lazy code and if you can trust the compiler to take care of some obvious optimizations and shortcuts.

Tests

Each of the following test case is called by the following line (to ensure the optimizer doesn’t just skip execution etc.):

printf("%d %d %d %d %d %d\n",opt1(),opt2(),opt3(),opt4(),opt5(0),opt5(1));

opt1()
int opt1() {
	return 6+7;	
}

The following of course always returns 13. We assume the compiler should always substitute function calls with the result.

opt2()
int opt2() {
	int x=rand();
	int y=rand();
	return x*y-x*y;   // 1-1=0
}

Always returns a zero. The function call can’t be eliminated because even if we know the result, two calls to rand() are needed. The two substitutions and multiplications can still be omitted.

opt3()
int opt3() {
	int x=rand();
	int y=rand();
	return ((x*y)<10000)?(x):(x*y);	
}

Will the compiler see it doesn’t have to calculate x*y twice?

opt4()
int opt4() {
	if (opt3()) {
		rand();	
	} else {
		rand();	
	}
	
	if (opt2()) {
                // should never get here!
		return opt3(); 
	} else {
		return 5+opt3();	
	}
}

Tests if the compiler leaves out blocks that will never be on the code path (opt2() always returns zero) and if the else-clause yields to same code as the if-clause.

opt5()
int opt5(int x) {
	return x*opt1();	
}

Does the compiler substitute the function call with the result? If x equals to zero, the result will always be zero.

Results

GCC 4.2.2 -O2 GCC 4.2.2 -O3
opt1() Call Result substituted, no call
opt2() Call, no calc, returns zero Result substituted, no call
opt3() Call, remembers x*y Inlined
opt4() Call, forgets opt2() always returns zero, although

if (opt3()) {
  rand();
} else {
  rand();
}

correctly becomes

opt3();
rand();
Inlined, all excess code eliminated
opt5() Call, calls opt1(). No optimizations Inlined, substituted

Analysis

As you can see from the results, -O2 isn’t too smart. Many clear cases such as the one with opt1() go unnoticed. I would assume a typical method that returns the value of a private member also has to be called instead of just accessed directly, so beware (note: I didn’t test g++ but as you know, it simply produces wrapper code for GCC).

-O3 on the other hand is a much better companion to a lazy coder, you can trust it eliminates downright crap code you don’t feel like cleaning manually. With -O3 you pretty much can use inlined functions instead of macros with no penalty (as long as you set the inline limits et cetera).

14 May

-mthreads, -mthreads, -mthreads…

This falls in the obvious category but I’m posting this just that someone will see this even by accident: always use -mthreads when using threads under MinGW. I repeat (so that I remember this in future): always use -mthreads when using threads under MinGW. I just spent hours using gdb instead of -mthreads.

I.e.:

g++ -mthreads -o game.exe game.c

The mentioned option makes MinGW use thread-safe stuff internally. It is not enough a library claims to be thread-safe (well, actually it probably claims to be thread-safe as long as the runtime providing malloc et al is safe, which indeed is not the case when not using -mthreads).

At least Viewer2 is fast now.

29 Apr

Automatic build numbers

I needed a program that would keep track of a single number, increase it on demand and output the number to the standard output. This is because I don’t really like using any IDEs (at least anything for MinGW) and so I don’t have all the modern luxuries like automatic build numbers (a sequential number that increases every time the program gets built). I have only an editor, a compiler, and make. I have included the packaging of the releases in the makefile (i.e. make & make installer). It’s eventually confusing to name all of your installers with the same filename, so that’s what I need the build number for.

The program I wrote, called bn, stores the build number in a single-line file. bn filename increases the number in filename and outputs the result. If you add noinc after the filename only the current number is output. In the makefile, add bn buildnumber in the rule that builds your executable:

program.exe: program.c
  gcc program.c -o program.exe
  bn buildnumber

Every time program.exe gets built, the number in the file buildnumber increases by one. I use the limited command line interface in Windows but the make I use has the more capable shell sh built-in, so I can have a rule like this (cp -f is the same as copy /y in sh):

installer: program.exe
  makensis program.nsi   # outputs the installer as installer.exe
  cp -f installer.exe installer-build-`bn buildnumber noinc`.exe

In sh, everything that is between two grave accents (`) gets treated as a separate command as you would type it in. However, this makes sh substitute bn buildnumber noinc with whatever the command prints in stdout. So, effectively the line in the makefile is seen as (assuming the build number is 1000):

  cp -f installer.exe installer-build-1000.exe

Note: You have to manually create a file named buildnumber. Just save a text file with a single line with the number that you want to start counting from. This is so that you don’t accidentally overwrite something.

Scratch one tedious manual task per release! You can also use this to inject numbers in the actual executable or keep track of times a program is run (by using a batch file).

Download the program with the source code (in C)