11 Nov

Compare text files in SQL

Here’s a quick hack that allows loading text files as tables in an SQLite database. Why? It’s pretty nice to compare files in SQL.

txtsql can be run as follows: txtsql.exe file.txt file2.txt ...

Every text file is loaded as a table with two columns: row (the line number) and data (the actual text). When a table is saved with the save command, the first column is skipped if it is named row. Thus, every loaded table and every table created with new is saved as a normal text file. Any other table is saved with tab characters separating the cells.
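For example, a table created with plain SQL gets the tab-separated treatment when saved (a hypothetical session – the stats table is made up, and I’m assuming save accepts a table like any other):

load `file.txt`
create table `stats` (data, hits)
insert into `stats` select data, count(*) from `file.txt` group by data
save `stats`

Since stats was neither loaded nor created with new, it is written out with tabs between the cells.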

The program is just a test to see if this is a viable concept (post a comment if you think it is). If it seems worth developing further, I’ll add things like transparent loading and saving of text files as tables when they are referenced (for now you have to load them with the load command), common text operations (fuzzy searching and so on) and such.

Examples

Count lines:

load `file.txt`
select count(*) from `file.txt`

Save unique lines in a new file:

load `file.txt`
new `uniq.txt`
insert into `uniq.txt` (data) select distinct data from `file.txt`
save `uniq.txt`

Find similar lines:

load `1.txt`
load `2.txt`
select data from `1.txt` where data in (select data from `2.txt`)

txtsql.zip – Binaries (Win32) and source code

07 Nov

What exactly does GCC optimize?

All compilers optimize code to some extent, some better than others. However, at least to me, the exact nature of those optimizations is unclear, beyond explicitly documented things like loop unrolling. Let’s find out which kinds of programmer laziness and/or stupidity get eliminated.

Setup

I used GCC 4.2.2 (MinGW) to compile the test program. The following options produce an assembly listing of the compiled code juxtaposed with the C source:

gcc -c -Wa,-a,-ad opt.c > opt.lst

The tests mostly check whether the compiler realizes a function always returns a certain value (opt1(), opt2()), whether a function call can be eliminated completely by looking at the input parameter (opt5()) and so on. While compilers do a good job at saving CPU cycles by passing parameters in registers instead of on the stack and so on, the question here is how they cope with lazy code and whether you can trust the compiler to take care of some obvious optimizations and shortcuts.

Tests

Each of the following test cases is called by the following line (to ensure the optimizer doesn’t just skip execution etc.):

printf("%d %d %d %d %d %d\n",opt1(),opt2(),opt3(),opt4(),opt5(0),opt5(1));

opt1()
int opt1() {
	return 6+7;	
}

This function of course always returns 13. We assume the compiler should substitute the call with the result.

opt2()
int opt2() {
	int x=rand();
	int y=rand();
	return x*y-x*y;   // a-a = 0, whatever a is
}

Always returns zero. The call itself can’t be eliminated because even if we know the result, the two calls to rand() still have to happen. The assignments, the multiplications and the subtraction can still be omitted.

opt3()
int opt3() {
	int x=rand();
	int y=rand();
	return ((x*y)<10000)?(x):(x*y);	
}

Will the compiler see it doesn’t have to calculate x*y twice?

opt4()
int opt4() {
	if (opt3()) {
		rand();	
	} else {
		rand();	
	}
	
	if (opt2()) {
                // should never get here!
		return opt3(); 
	} else {
		return 5+opt3();	
	}
}

Tests whether the compiler leaves out blocks that can never be on the code path (opt2() always returns zero) and whether it notices the if clause and the else clause yield the same code.

opt5()
int opt5(int x) {
	return x*opt1();	
}

Does the compiler substitute the function call with the result? If x equals zero, the result will always be zero.

Results

        GCC 4.2.2 -O2                           GCC 4.2.2 -O3
opt1()  Call                                    Result substituted, no call
opt2()  Call, no calc, returns zero             Result substituted, no call
opt3()  Call, remembers x*y                     Inlined
opt4()  Call, forgets opt2() always             Inlined, all excess code
        returns zero (see below)                eliminated
opt5()  Call, calls opt1(), no optimizations    Inlined, substituted

At -O2, opt4() forgets that opt2() always returns zero, although

if (opt3()) {
  rand();
} else {
  rand();
}

correctly becomes

opt3();
rand();

Analysis

As you can see from the results, -O2 isn’t too smart. Many clear cases such as opt1() go unnoticed. I would assume a typical method that just returns the value of a private member also has to be called instead of the member being accessed directly, so beware (note: I didn’t test g++, but it is essentially a driver for the same GCC backend).
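To illustrate with a hypothetical C analogue (struct counter and counter_get() are made up for this example):

/* With this definition in another translation unit, -O2 emits a real
   call here instead of just reading the member directly; with the
   definition visible, -O3 collapses the call to a single load. */
struct counter { int value; };

int counter_get(const struct counter *c) {
	return c->value;
}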

-O3 on the other hand is a much better companion to a lazy coder: you can trust it to eliminate the downright crap code you don’t feel like cleaning up manually. With -O3 you can pretty much use inline functions instead of macros with no penalty (as long as you set the inline limits et cetera).
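For example (a sketch – the names are mine, not from the test program): at -O3 the function version below compiles to the same code as the macro, minus the classic macro pitfall of evaluating the argument twice.

#define SQUARE_MACRO(x) ((x)*(x)) /* evaluates x twice: SQUARE_MACRO(rand()) misbehaves */

static inline int square(int x) { /* inlined away at -O3, single evaluation */
	return x*x;
}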

22 Oct

There’s Plenty of Room at the Bottom

This is the first part of an epic (two-part) series of articles about tiny intros; the next one will be about an actual intro.

I love the tiny graphical presentations called “intros” in the demoscene, and by tiny I mean 256 bytes tiny. Usually at this point people mention that a paragraph of text is well over 256 bytes worth of data. However, you should not think that a computer needs the same amount of data as we do to store information such as “multiply variable X by variable Y”. That textual representation needs 34 bytes of storage, less if you write it using shorter words or simply as an equation. What a processor needs is something like 3 instructions, 2-4 bytes each.

However, my point is that 256 bytes is ridiculously little for most people (even though you can hardly impress a mouth breather playing Halo 3 by stuffing amazing things in that space). At those sizes the source code is many times bigger than the resulting program. On the other hand, 256 bytes is plenty of space for those who are crazy enough to write such things. The reason executables tend to be much, much bigger than that is that there is a lot of stuff in there that the programmer didn’t put there himself/herself.

The Code

An obvious source of bloat in an executable is the fact that programs are written in high-level languages such as C++ and then compiled into code the processor can run. Compilers generally do a good job at producing working executables with fast code, but they don’t even try to save space. Add tons of modern features – object polymorphism, runtime error checking et cetera – and you have your normal 100 kilobytes of code for the smallest program, one that doesn’t necessarily do anything but exit back to the desktop. On top of that, the executable file also houses information such as which libraries to use, the program icon and so on.

One way around the problem above – and it is a problem only if you for some reason have decided to spend your time doing something very few find nice – is to write the programs in more condensed languages. For example, C produces smaller code than C++ (thanks to the fact that there is minimal extra fluff in C), and writing everything in assembly language produces pretty much the smallest code it is possible to write, because assembly basically is the human-readable version of machine language. A reminder: those modern languages (and C isn’t even modern, as it was conceived in the 1970s) were created because writing complex things in assembly is not everyone’s cup of tea – which makes writing very tiny stuff in assembly not-everyone’s-cup-of-tea squared.

[Screenshots: “rubba_b” by kometbomb… and the same intro for Windows – 254 bytes vs. 9278 bytes]

Another layer of arcane computing that makes a 256-byte executable possible is the use of old operating systems such as DOS. As you read above, the executable contains information about libraries and such, but back in the 80s hardly any shared libraries were used at all. So an executable pretty much was only the compiled program code, and that was it. No extra fluff. Using this old executable format and assembly language, the programmer essentially is the only factor that has any effect on the size of the executable.

Ironically, while old operating systems with no libraries don’t provide any modern conveniences like fast 3D graphics or a mentionable interface to anything at all, this is a blessing. You can access the hardware directly, which means a lot less code to do things like drawing a pixel on the screen. You don’t even have to set up those libraries, because there aren’t any. As an example, the same intro (albeit converted from assembly to C) is about 10 kilobytes in size in Windows XP, while the original version for DOS is less than 256 bytes. And it would be even larger in Windows if it used the hardware to do the things it does in software.

The Data

Yet another reason why things like computer games, or even larger intros, are huge compared to the tiniest intros is that they come with a lot of precreated data like graphics and music. For 256-byte intros, even a precreated flight path for the camera is pretty much a no-go. Or a 3D object with, say, more than four faces. That means we have to use procedurally generated data, which is in vogue right now even outside size restrictions, with things like Spore.

Actually, in many new-school 256-byte intros the data is not even precalculated but simply calculated while the intro draws it on the screen. This saves the space you would need if you had both a drawing loop and a precalculation loop (remember, even the loop structure uses precious bytes) – and it works thanks to the fact that even slow things are fast on a relatively new computer.

Speaking of slow things, one of them is raytracing, which is purely mathematical, which means very little precreated material. Raytracing also is one way to produce realistic three-dimensional images, which, as a rule of thumb, look nice. So it’s no surprise that a modern 256-byte intro uses raytracing, or raycasting, to simulate nature, look amazing and still fit in the specified size limit. If it’s 256 bytes and 3D, it probably uses some blend of raycasting. Older intros usually were versions of familiar 2D effects such as the rotozoomer, the fire effect, the Mandelbrot set and other well-explored routines.

The Art

Even among demo aficionados there is some controversy over whether the tiniest intros have any merit as art or whether they simply are a proof of concept of how small you can make something. There is no hidden gold vein in there, as in nanotechnology. While most tiny intros are for the most part coder porn, nobody can deny their status as art installations. And if you stay in the 128-256 byte range, most of the popular tiny intros are quite pretty too, especially considering all the compromises that had to be made to make them fit.

The ultimate irony in downplaying the merits of the intros is that the demoscene pretty much was born from people pushing the machine to its limits. Nowadays, to create a great demo (think of it as a large intro with many parts) you have to push your artistic talent more than the limits of the machine. It’s more like creating a music video, though with your own tools. It seems the final limit is the man-made one: the smallest size you can fit something in and still call it art with a clean conscience.

15 Sep

C masochism, part 1

I like C. I write tools in C that sane people write in Ruby or whatever the trendy script-like loosely typed language of the week is. But this is not about which language to choose. This is about doing stuff in C just because a lot of people think it’s 1) limited, 2) hard and 3) too old to be used in the 2000s.

I like C. I like how a compiled executable is usually in the 10-kilobyte range for simple programs. I like how you don’t need anything besides the executable. I like how I can’t blame anyone but myself if the program is too slow.

But a significant part of the reasoning why I choose C over and over again is the challenge. I recently read a blog entry about how programming can ruin your life, which I was ready to judge as another lame blog post about how programming is wizardry and how programmers see everything differently (you know, a bit like how at the end of The Matrix (1999) Neo sees that green Matrix code everywhere). However, I have to agree the post has a good point: I too tend to make even purely practical things challenging, e.g. by using vanilla C.

Back to the point. I was writing this tool for downloading files using curl (the command line tool, not the library). I also use FlashGot for Firefox, which saves all the selected URLs in a text file and runs the program with the text file specified on the command line. Parsing the text file is simple enough in C:

#include <stdio.h>

void download_url_with_curl(const char *url); /* implemented elsewhere */

int main(int argc,char **argv) {
  FILE *f=fopen(argv[1],"r");
  char line[100];

  while (fgets(line,99,f)) {
    download_url_with_curl(line);
  }

  fclose(f);
  return 0;
}

However, I don’t like how the program is dependent on the text file (note how at this point the additional challenges start to sneak in). The text file stays open as long as the program keeps downloading the files, and that can be a long while if the files are huge. Of course, the program should read the list into memory and close the file.

int main(int argc,char **argv) {
  FILE *f=fopen(argv[1],"r");
  char line[100][100];
  int num_lines=0,i;

  while (num_lines<100 && fgets(line[num_lines],99,f)) {
    num_lines++; /* count only lines actually read */
  }

  fclose(f);

  for(i=0;i<num_lines;i++) download_url_with_curl(line[i]);

  return 0;
}

Still quite simple, but there are two problems: what if there are more than 100 lines? What if the lines are longer than 100 characters (including the null terminator)? At the very least, the array should be dynamic, i.e. it should be able to resize itself as lines are read. This is still quite simple, but it starts to get annoying:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc,char **argv) {
  FILE *f=fopen(argv[1],"r");
  char **lines=NULL,line[100];
  int num_lines=0,i;

  while (fgets(line,99,f)) {
    lines=realloc(lines,(num_lines+1)*sizeof(char*));
    lines[num_lines++]=strdup(line);
  }

  fclose(f);

  for(i=0;i<num_lines;i++) {
    download_url_with_curl(lines[i]);
  }

  for(i=0;i<num_lines;i++) {
    free(lines[i]); // free the allocated strings
  }

  free(lines); // free the array
  return 0;
}

You have to reallocate memory a lot and you have to free the memory. Not perverted enough. It seems I have run out of real improvements to make and have to find something else challenging.
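For completeness, the reallocation churn does have a standard cure – growing the array geometrically so realloc runs only a logarithmic number of times – though that’s exactly the kind of manual bookkeeping I’m trying to avoid (a sketch, not from the tool itself; read_lines() is a made-up name):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read every line of f into a dynamically grown array; doubling the
   capacity keeps the number of realloc calls logarithmic. Stores the
   array in *out and returns the line count. */
int read_lines(FILE *f, char ***out) {
	char **lines = NULL, line[100];
	int num_lines = 0, capacity = 0;

	while (fgets(line, sizeof(line), f)) {
		if (num_lines == capacity) {
			capacity = capacity ? capacity * 2 : 16; /* double when full */
			lines = realloc(lines, capacity * sizeof(char *));
		}
		lines[num_lines++] = strdup(line);
	}
	*out = lines;
	return num_lines;
}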

I figured I could use the stack to have a kind of dynamic array by using recursion: if every recursion allocates a bit of the stack to store one array item, it would automatically allocate and free the memory when the functions return:

void parse_line(FILE *f) {
  char *line=malloc(sizeof(char)*100);

  if (fgets(line,99,f)) {
    parse_line(f);
    download_url_with_curl(line);
  } else {
    fclose(f);
  }

  free(line);
}

int main(int argc,char **argv) {
  parse_line(fopen(argv[1],"r"));
  return 0;
}

This basically works as follows: the parse_line() function recurses until all of the file is read and closes the file, and then download_url_with_curl() gets called as each function returns (note that it will do everything backwards, which doesn’t matter when downloading files).

The stack will overflow at some point, but it should handle ~50,000 or so recursions – per array item, there are only the saved registers and the single pointer to the string on the stack. An acceptable tradeoff when you consider the code is actually about as complex as the first incarnation with the fixed array size. Obviously, the people who bitch and moan about pointers being evil and how C doesn’t have any dynamic conveniences have never considered this elegant method.

04 Sep

Image retargeting

Note: See below for the updated version.

I saw this video of a SIGGRAPH paper about image retargeting (high res version here, read the paper here), that is, rescaling an image so that the algorithm keeps the interesting areas intact and doesn’t squash everything. The paper calls it seam carving.

The video made it look amazingly simple (and actually explained the whole idea much better than most papers manage to), so obviously I had to try my hand at it. After about three hours of coding I came up with my version (you can find the source code below!).

[Images: original image and two retargeted versions]

Notice how the guy’s face and the cloud stay the same even if everything else is stuffed in the smaller image area.

[Images: original image and retargeted version]

Again, the higher contrast areas (i.e. the man and the dogs, black on white) are kept the same while the snowy area is made narrower.

[Images: original and retargeted world map] It’s a small world… ;)

I didn’t read the SIGGRAPH paper, so I don’t know what makes their algorithm work so well (or maybe they just chose the right images for the video). My program works as follows when shrinking the image horizontally (a rough code sketch follows the list):

  1. For each column, traverse from top to bottom picking any of the three (or more) neighboring pixels below the current pixel

  2. Calculate the “penalty” or error, i.e. try to pick the neighboring pixel that is colored as similarly as possible compared to the one next to it (in the direction we want to shrink the image)

  3. From these paths, pick the path that has the lowest penalty and crop the pixels along the path, while moving the rows to the left, as you would delete characters in a text

  4. Repeat until the image width is what was requested
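For the curious, here’s a rough sketch of that greedy approach in C – not the actual source from retarget3.zip; remove_seam() and the 8-bit grayscale layout are made up for illustration (the real program works on 24-bit color):

#include <stdlib.h>
#include <string.h>
#include <limits.h>

/* Remove one low-penalty vertical path of pixels from an 8-bit grayscale
   image stored row-major in img[]; returns the new width. */
int remove_seam(unsigned char *img, int w, int h) {
	int *path = malloc(h * sizeof(int));
	int *best_path = malloc(h * sizeof(int));
	int best_penalty = INT_MAX;
	int start, x, y, dx, src, dst;

	for (start = 0; start < w; start++) { /* step 1: one path per starting column */
		int penalty = 0;
		x = start;
		for (y = 0; y < h; y++) {
			path[y] = x;
			/* step 2: penalty is how different this pixel is from the
			   one next to it in the shrink direction */
			if (x + 1 < w)
				penalty += abs(img[y * w + x] - img[y * w + x + 1]);
			if (y + 1 < h) {
				/* greedily step to the most similar of the three pixels below */
				int best_x = x, best_diff = INT_MAX;
				for (dx = -1; dx <= 1; dx++) {
					int nx = x + dx, diff;
					if (nx < 0 || nx >= w) continue;
					diff = abs(img[y * w + x] - img[(y + 1) * w + nx]);
					if (diff < best_diff) { best_diff = diff; best_x = nx; }
				}
				x = best_x;
			}
		}
		if (penalty < best_penalty) { /* step 3: remember the cheapest path */
			best_penalty = penalty;
			memcpy(best_path, path, h * sizeof(int));
		}
	}

	/* crop the pixels along the best path, shifting the rest left;
	   the image is compacted in place to the new stride w-1 */
	for (y = 0, src = 0, dst = 0; y < h; y++)
		for (x = 0; x < w; x++, src++)
			if (x != best_path[y])
				img[dst++] = img[src];

	free(best_path);
	free(path);
	return w - 1;
}

Calling remove_seam() in a loop until the width matches the target gives step 4; picking the cheapest of all per-column paths is what keeps the high-contrast areas intact.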

In all, this is very slow, but it could be made faster (as in the video that shows realtime scaling) if the penalty or error values were precalculated for each pixel. The algorithm should also try to pick paths that are further apart, so it would remove pixels more evenly, and it should backtrack when trying to find the optimal path. Now it just follows a “wall”, i.e. a high-contrast area, when it finds one – it should backtrack and try to find a path further away. Finally, there should be a feature that allows the user to mark faces and other areas that should never be scaled.

To use the program, run it from the command line or drag a 24-bit BMP image onto the icon. Resize the window to scale the image. If you want to save the image, simply answer “yes” when exiting the program.

New version

When using the new version, you can resize to a specific size (as requested) by running the program as follows:

retarget image.bmp 800 600

This will try to resize the image to an 800×600 resolution. The new version is able to load JPEG, PNG, BMP and probably some other formats too (thanks to the SDL_image library). Note that it will still save in BMP format, even if the extension is .jpg or similar.

Use the left mouse button to mark areas such as faces, eyes and so on, and the right mouse button to mark areas that you want to remove. The middle mouse button erases the marks. To tweak the blur amount (less is better for cartoon-style images and maps, more for photos), run it like this:

retarget image.bmp 800 600 4

Now there will be twice as much blur as usual (the default is 2).

retarget3.zip – the program with the source code (you need SDL and SDL_image)

Here’s the original version; it is still useful:

retarget2.zip – the program with the source code (you need SDL)