03 Mar

Review: Starball

I used to read ST Format back when I had an Atari ST. Later in its existence the magazine started to include better and more complete games and other software on its cover disk, probably because the Atari was dying and publishers decided to give away their assets (or, more likely, the actual developers were able to secure the rights to the software from the publishers and then gave it away so the work wouldn’t be in vain). Also, back in the day shareware meant you got the full software to play with and only then had to decide whether to pay for it; it wasn’t uncommon for that to actually be profitable. Whatever the reason, without those cover disks I probably would never have heard of some pretty awesome games.

Starball was one of those games.

On the Amiga, there were a few excellent pinball games by Digital Illusions, including Pinball Dreams and Pinball Fantasies. On the Atari ST, there were practically no modern pinball games apart from the STE title Obsession, which was most likely created to cash in on the surging interest in pinball games that DI had sparked, and also because the STE was capable of the same smoothness as the Amiga. But for plain vanilla ST users there still were none. At least until Starball, that is.

Starball is a modern pinball game (as in the screen scrolls and it was made in the 1990s), but its look and feel are very different from the relatively realistic pinball games on the Amiga. The gameplay and the table are very similar to the Crush series on the PC Engine (TurboGrafx-16), which realized a pinball video game doesn’t have to simulate a real pinball table and added common elements from video games in general. While Pinball Dreams had very smooth gameplay based on getting accurate chains of ramp runs, in Starball the ball is used to smash flying spaceships, cultists and Jimmy Hill’s chin. In addition, Starball has three smaller areas, each with its own set of flippers and even its own graphical theme, and when you miss and the ball falls down, it only moves you one area down. Unless you are already in the bottom area.

In the middle area, you are building a spaceship one part at a time and trying to stop turrets from destroying the parts. On the top level, you try to crush little guys walking in circles, and there’s a slimy face in the center that gets slimier each time you kill all the little guys. The bottom area has a huge fly-eyed alien and more explosions. The fly alien also contributes to the game in that its mouth takes you to a bonus level. There are at least two different bonus levels; in general they are much like the small Mario level in NES Pinball, in keeping with the overall pinball theme.

While the game is very enjoyable, at least as a nostalgia trip, there are some faults. First, the gameplay isn’t nearly as fluid as the standard set by Dreams at the time. The flippers feel sluggish and sometimes the ball bounces all weird. However, you can easily adapt to the slight delay in the flipper hit. The table isn’t terribly interesting in that there are no huge ramps or other stuff the Amiga games did well, but the grimy graphics and the self-awareness of the limitations make the game stand out.

The game can be found on the ST Format 64 cover disk available on this page.

28 Feb

An Alternative to XML

My new weapon of choice. libconfig is a configuration file parser that supports arrays, named and typed members, selection by path (e.g. cfg.users.[3].name) and more. That is, it has basically all the useful (as in 80% of cases) features of XML and none of the bad ones. There’s a minimal but well-defined structure that will work for most situations and that can be used to enforce e.g. that all array items are of the same type. There’s no overkill markup, so it’s easy for humans to read and write. The library can also write the settings tree into a text file.

The configuration files look like this:

screen = { width = 300; height = 200; }
users = ( { name = "Torgo"; items = [ "Item 1", "Item 2" ]; } );

And in C you would do something like this:

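#include <stdio.h>
#include <libconfig.h>

config_t cfg;
int screen_height = 100;   /* default, kept if the lookup fails */
const char *name;

config_init(&cfg);
if (!config_read_file(&cfg, "config"))
  puts("Could not read the configuration file");
if (!config_lookup_int(&cfg, "screen.height", &screen_height))
  puts("Using default screen height");
if (config_lookup_string(&cfg, "users.[0].name", &name))
  puts(name);
config_destroy(&cfg);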

You can also iterate the setting tree without the path for easier array or tree traversal. In all, I would say it involves less work compared to any XML library, especially in C. I like to think it’s a good example of software designed by the same guy who also uses it and not by some external committee.

31 Dec

Creating a Simple GUI from Scratch

Introduction

More often than not, the biggest obstacle for a programmer is adding a GUI to an otherwise finished application. This is probably because you can generally create a program that inputs and outputs text with a few lines of code in any language. However, if you don’t count MessageBox() and other similar helper APIs that are available in most windowing systems and that are as simple to use as gets() and printf(), it’s an unnecessarily big step to change a command line program into a program that outputs the exact same text but with the complex command line arguments replaced by a few buttons and a window showing the output.

However, if you are developing a program that already draws something on the screen, it’s really easy to add simple mouse interactivity. This is especially true if you are using SDL (even if it’s extremely bare bones in these things!) and already use SDL_Rect (your standard rectangle) to draw things. You can simply change the draw routine so that it takes one more argument, the SDL_Event struct you’re already using to check for a quit message and so on. Then for every object you are drawing on screen, check if the event is a mouse button down event and that the coordinates are inside the rectangle that will be drawn. This completely eliminates the need to plan a separate system that interprets the mouse events. It’s sort of piggybacking on the existing code with minimal changes.

Now, you might already think that is way too simplistic and lazy, but think again. Not many programs need a more complex GUI with multiple movable windows and so on. If you really do need that, then be my guest and create an exhaustive windowing system or take the time to learn an existing one. But it is still overhead and you have to do comparatively a lot of work before any real results. At least I hate that kind of mental overhead. And while this whole idea of combining the drawing and the event processing sounds like a bit of a hack, it really isn’t a “hack” as in patching something you will probably have to replace with a better solution later. It’s just a different way of doing almost the same thing. With less overhead.

I have used this approach in my latest project and from direct experience I know it is possible to create scrollbars, scrolling text fields, text input fields, message boxes and pretty much anything I have needed. And it’s not too much code either even if you have to create absolutely everything from nothing. In other words I do not feel limited or burdened.

How to do it

As explained above, you most likely are drawing to a specific region on the screen for every object you need to check for mouse clicks. It is not necessary to have any kind of array where those regions are. You need to make sure that the draw loop runs once for every mouse click event and that the event gets passed to the draw routine. Then check if the coordinates are inside the draw region.

This is where it admittedly gets a bit hacky: if a button is pressed, the action that the button triggers will run in the middle of the draw loop (unless you somehow buffer the events). In most cases this doesn’t matter at all, but what is on the screen may not exactly match the real state; you might have two selected items for the duration of one frame and so on. For the most part this is an acceptable price since you are getting results with very little overhead.

Consider this example.

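#include "SDL.h"

extern SDL_Surface *screen, *button_gfx;  /* set up elsewhere */
void button_event(void);                  /* whatever the button does */

/* Plain drawing, no interaction. */
void draw_stuff(void)
{
  SDL_Rect position = {10, 10, 40, 40};

  SDL_BlitSurface(button_gfx, NULL, screen, &position);
}

/* The same routine with the event check injected before the blit. */
void draw_stuff_and_check_events(SDL_Event *event)
{
  SDL_Rect position = {10, 10, 40, 40};

  if (event->type == SDL_MOUSEBUTTONDOWN
   && event->button.x >= position.x
   && event->button.y >= position.y
   && event->button.x < position.x + position.w
   && event->button.y < position.y + position.h)
     button_event();

  SDL_BlitSurface(button_gfx, NULL, screen, &position);
}

void event_loop(void)
{
  SDL_Event event;

  while (SDL_PollEvent(&event))
  {
    if (event.type == SDL_QUIT) break;

    /* Every event gets one pass through the draw routine. */
    draw_stuff_and_check_events(&event);
  }
}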

From the above example, the benefits of this approach are obvious. The event checking can simply be injected anywhere in the code as long as you have the event and a region.

Solutions to common problems

One downside is that if you draw multiple overlapping regions, the region drawn last will be the one that is visible on the screen but the click is handled by the first region. Our event checking sees the regions from the other side of the screen. In such cases you can first iterate the regions in reverse order checking the event and then draw them in the correct order. An example in the mentioned project is the menu: submenus often overlap the parent menu. I solved that by first going through the menus in the order the user sees them and then drawing them back to front.
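The reverse-order check could look roughly like the sketch below. Region is a hypothetical stand-in for SDL_Rect so that the snippet compiles on its own, and topmost_region_at() is a name made up for this example; the only assumption is that the regions are stored in the order they are drawn.

```c
#include <stddef.h>

/* Hypothetical stand-in for SDL_Rect so the sketch compiles without SDL. */
typedef struct { int x, y, w, h; } Region;

/* Returns 1 if the point (px, py) lies inside the region. */
static int region_contains(const Region *r, int px, int py)
{
    return px >= r->x && py >= r->y &&
           px < r->x + r->w && py < r->y + r->h;
}

/* Regions are stored in draw order: index 0 is drawn first (bottom),
   index count-1 is drawn last (top). Walking the array backwards
   therefore finds the region the user actually sees under the cursor.
   Returns the index of that region, or -1 if the click hit nothing. */
static int topmost_region_at(const Region *regions, size_t count,
                             int px, int py)
{
    for (size_t i = count; i-- > 0; ) {
        if (region_contains(&regions[i], px, py))
            return (int)i;
    }
    return -1;
}
```

After the hit test, the regions are drawn in the normal forward order, so the picture and the click handling agree on what is on top.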

Dragging items is easy: simply check for mouse motion events and if the button is held down and the mouse is moved, just adjust the position of the matching region. You do not need any special “drag starts now” and “drag ends now, update objects” phases. However, this simplistic method is subject to the previous issue if you move a region over one that seems to be under it. To avoid that, you can simply record the clicked object when the mouse button is pressed and make the motion events match only that region.
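Here is a minimal sketch of the “record the clicked object” variant. The Event and DragState types are hypothetical and only carry the fields the example needs; in an SDL program the same information would come from SDL_Event.

```c
#include <stddef.h>

typedef struct { int x, y, w, h; } Region;

/* Hypothetical, simplified event type for this sketch. */
typedef enum { EV_BUTTON_DOWN, EV_BUTTON_UP, EV_MOTION } EventType;
typedef struct {
    EventType type;
    int x, y;     /* cursor position */
    int dx, dy;   /* relative movement, only used by EV_MOTION */
} Event;

typedef struct {
    Region *regions;  /* stored in draw order, last one is on top */
    size_t count;
    int grabbed;      /* index of the region being dragged, or -1 */
} DragState;

static int inside(const Region *r, int px, int py)
{
    return px >= r->x && py >= r->y &&
           px < r->x + r->w && py < r->y + r->h;
}

static void drag_handle_event(DragState *s, const Event *e)
{
    switch (e->type) {
    case EV_BUTTON_DOWN:
        /* Remember the topmost region under the cursor. */
        s->grabbed = -1;
        for (size_t i = s->count; i-- > 0; ) {
            if (inside(&s->regions[i], e->x, e->y)) {
                s->grabbed = (int)i;
                break;
            }
        }
        break;
    case EV_MOTION:
        /* Only the grabbed region moves, whatever it passes over. */
        if (s->grabbed >= 0) {
            s->regions[s->grabbed].x += e->dx;
            s->regions[s->grabbed].y += e->dy;
        }
        break;
    case EV_BUTTON_UP:
        s->grabbed = -1;
        break;
    }
}
```

Because the motion events never do a hit test of their own, dragging one region across another cannot accidentally pick up the wrong one.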

FAQ

I absolutely need a separate window

This is possible as long as the window can be modal (like a message or open file dialog that takes over from the window behind it). You simply jump to another event loop until the new window is closed, much like it’s done in the Windows API with GetOpenFileName() or the MessageBox() mentioned earlier. Then it’s just a matter of drawing the new window and checking for the events normally.
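As a rough illustration of the nested loop, here is a sketch that uses a scripted event queue instead of a real windowing system. EventQueue, next_event() and run_modal_dialog() are all hypothetical names; in an SDL program next_event() would be SDL_PollEvent(), and an empty queue would mean “keep waiting” rather than “give up” as it does here.

```c
#include <stddef.h>

typedef enum { EV_NONE, EV_CLICK_OK, EV_CLICK_CANCEL, EV_OTHER } EventType;

/* Hypothetical scripted event source standing in for the real one. */
typedef struct {
    const EventType *events;
    size_t count;
    size_t next;
} EventQueue;

static EventType next_event(EventQueue *q)
{
    return q->next < q->count ? q->events[q->next++] : EV_NONE;
}

/* Runs its own event loop until the dialog is answered.
   Returns 1 for OK and 0 for Cancel (or when the script runs out). */
static int run_modal_dialog(EventQueue *q)
{
    for (;;) {
        EventType e = next_event(q);

        /* A real dialog would draw itself here and check e against
           its own button regions while drawing. */
        if (e == EV_CLICK_OK)
            return 1;
        if (e == EV_CLICK_CANCEL || e == EV_NONE)
            return 0;
        /* Anything else is swallowed: the window behind never sees it. */
    }
}
```

The main loop simply calls run_modal_dialog() from a button handler and carries on with the return value once the call comes back.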

Is this reusable?

Of course. You could create a small library that has basic functionality and helper functions. A practical example can be seen here.

Even if you can use the event checking straight in the source code, you can still define the regions in an array and link to relevant event handlers.
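A table-driven version might look something like this. Widget, ClickHandler and dispatch_click() are names invented for the sketch, not part of SDL or of the project mentioned above.

```c
#include <stddef.h>

typedef struct { int x, y, w, h; } Region;
typedef void (*ClickHandler)(void *userdata);

/* One entry per clickable thing: where it is and what it does. */
typedef struct {
    Region region;
    ClickHandler on_click;
    void *userdata;
} Widget;

/* Sends a click at (px, py) to the first widget whose region contains
   it. Returns 1 if a handler ran and 0 if the click hit nothing. */
static int dispatch_click(const Widget *widgets, size_t count,
                          int px, int py)
{
    for (size_t i = 0; i < count; ++i) {
        const Region *r = &widgets[i].region;
        if (px >= r->x && py >= r->y &&
            px < r->x + r->w && py < r->y + r->h) {
            widgets[i].on_click(widgets[i].userdata);
            return 1;
        }
    }
    return 0;
}

/* Example handler: counts how many times it was called. */
static void count_click(void *userdata)
{
    ++*(int *)userdata;
}
```

The draw routine can walk the same array to blit each widget, so the region is still defined in exactly one place.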

Conclusion

Combining the drawing and the event processing code can save time in the short term. Many common GUI elements are perfectly possible to replicate. The idea described above should be considered if a project needs mouse interaction and external libraries are not available, the conversion of existing code seems expensive or the learning curve is too steep compared to the future benefits. Even in borderline cases, it can be well worth prototyping because the overhead and the amount of up-front design are minimal.

09 Nov

Plenty of Room, Part II

This is the second part of the epic (two three-part) series of articles about tiny intros. The previous part was an essay about 256-byte intros in general.

Ed. note: Since this article seems to take forever to finish, here’s the first half of it. The (hopefully) final part will detail more of the specifics.

As stated earlier, tiny intros written in assembly language fascinate me. I have written a few in x86 assembly language; here’s one of them. I have tried to make the inner workings of the program as accessible (or at least as thought-provoking) as possible, even for readers whose weapon of choice isn’t assembly.

I’ve included the original DOS intro (you can use DOSBox to run it, it should also work on Windows XP) and a Win32 port of it, written in C while trying to keep the original structure intact. I’ll also try to explain the general idea of the effect in pseudocode where syntax can be an obstacle. The archive rubba_c.zip contains the source code, rubba_b.exe which is the compiled Win32 executable and RUBBA.COM which is the 16-bit MS-DOS executable. To compile the C program, you need the TinyPTC library (like SDL but even more bare-bones).

I won’t go into details about x86 assembly language; I suggest anyone interested first read an introduction and learn some of the basics. However, I’ll try to make the code look interesting, explain some weird or obfuscated code and probably show some basic size optimization tricks.

The Effect

The intro, called rubba_b, shows one effect: a twisting bar drawn using primitive light-shaded texture mapping. The color palette and the texture are generated at run time. The texturing is done using linear interpolation and no vector math is used even if the bar looks like it is rotated. The lighting is an extremely simple approximation of the light source being exactly where the camera is located. That is, the length of the textured line directly determines the light.

Viewed from above, the bar is a tower of squares. If one of the squares is rotated around the center, the corners will move in a circular pattern. So, the X coordinate will be equal to cos(corner_number*(360 degrees/4)+square_rotation). The Z coordinate (why Z? Because it goes towards the screen) is equal to the sine, but it can be discarded because we will not need it for perspective calculation. Remember, we’re short on bytes.

We then modulate the bar rotation for each square in the tower. If the amount of rotation changes when moving up or down the tower, the bar will twist. If the rotation stays the same for each square, the bar will rotate uniformly and look uninteresting.

The textured horizontal line is drawn from each corner to the next corner, from left to right. If the line would be drawn from right to left, we know it isn’t facing the camera. Another line facing the camera will be drawn over it, so we simply skip the line. The color value fetched from the texture is multiplied by the line length, which makes short lines darker.
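To make the corner and visibility logic concrete, here is a small sketch for one scanline. It is not lifted from the intro: visible_faces() and Face are hypothetical names, and the function expects the caller to pass in radius*cos(rotation) and radius*sin(rotation) for the current row (for instance from a sine table). Since the four corners are 90 degrees apart, their X offsets are simply +cos, -sin, -cos and +sin.

```c
/* One visible face of the bar on a single scanline. */
typedef struct {
    int x_left, x_right;  /* screen columns the face spans */
    int brightness;       /* face width, reused as the light level */
} Face;

/* cos_r and sin_r are radius*cos(rotation) and radius*sin(rotation)
   for this row of the bar; center_x is the screen column of the bar's
   axis. Writes the camera-facing faces to out[] and returns how many
   there are (at most two). */
static int visible_faces(int cos_r, int sin_r, int center_x, Face out[2])
{
    /* Corner k sits at rotation + k*90 degrees. */
    const int x[4] = { center_x + cos_r, center_x - sin_r,
                       center_x - cos_r, center_x + sin_r };
    int n = 0;

    for (int k = 0; k < 4; ++k) {
        int x0 = x[k];
        int x1 = x[(k + 1) % 4];

        /* Only edges that run left to right face the camera. */
        if (x1 > x0) {
            out[n].x_left = x0;
            out[n].x_right = x1;
            out[n].brightness = x1 - x0;  /* short edge = dark face */
            ++n;
        }
    }
    return n;
}
```

Calling this once per scanline with a rotation that changes a little from row to row gives the twist; the texture mapper would then step linearly across each returned face and scale the texel by brightness.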

Still with me?

The Code

Initialization

First things first. We need to set the video mode before we continue. In the Win32 version we simply open a TinyPTC window; in the original version we need to tell the BIOS to switch to a nice 320x200x8 video mode (the de facto video mode back in the day).

C:
ptc_open("rubba_b",320,200)

asm:
mov al,13h
int 10h

In the above code, the Win32 part is self-explanatory. The assembly bit probably needs some clarification: we put the number 13h (hex) in the 8-bit register al and request interrupt 10h. This is the BIOS video interrupt, and since register ax (which consists of al, “low”, and ah, “high”) equals 0013h (ax is normally zero when a DOS .COM program starts, and the intro relies on that), the BIOS will call the routine for changing the video mode (ah=00h) and switch to mode 13h.

If the above sounds complicated, don’t worry. It’s just a matter of memorization, similar to how you would memorize the function name for changing the video mode.

The next thing we need is some space for the texture, the sine table and the double buffer. In the Win32 version this is obvious: we just add a few arrays (although since TinyPTC only supports 32-bit video modes, we will also have an array for the palette). In the assembly version, to save some precious bytes, we again won’t use any fancy way to reserve memory: we simply decide that certain areas of the memory will be used for whatever we want. The benefits of no memory protection and single tasking. ;)

C:
short int sinetab[SINETAB];
unsigned char palette[256*4];
unsigned char texture[256*256];
unsigned char screen[320*200];

asm:
mov dh,80h
mov gs,dx
mov dh,70h
mov es,dx
mov dh,60h
mov fs,dx

The assembly version basically sets the register dx to 60xxh, 70xxh and 80xxh in turn (we set only the high byte, i.e. dh, to save bytes, so the low byte of dx is undefined, which won’t matter) and puts the value into the segment registers es, fs and gs.

This makes it so that if we use the different segment registers, we can access each of the three 64 kilobyte segments as a 64 kilobyte array. E.g. the sine table is in gs, thus mov ax,[gs:2] would move the second word of the sine table into ax. In C, the equivalent would be short int value=sinetab[1] (note how the C compiler knows that a word is 2 bytes, but in assembly you have to take care of that yourself; all memory accesses are always by the exact location, not the array element!).

All this is because in 16-bit memory accessing you can see only 64 kilobytes at one time. You can’t have a 128 KB array, nor can you have two 64 K arrays in the same segment. It’s something comparable to having a flashlight and a big sheet of paper in a dark room; you can move the light to show different areas but you will never see more than what there is in the spotlight.
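The arithmetic behind the flashlight can be sketched in C. In real mode the CPU forms the physical address as segment*16 + offset; the helper names below are made up for the example, and the segment values assume the undefined low byte of dx happens to be zero.

```c
#include <stdint.h>

/* Real-mode address translation: the 16-bit segment is shifted left by
   four bits (multiplied by 16) and the 16-bit offset is added. */
static uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Byte offset of element `index` in an array of 16-bit words. This is
   the scaling a C compiler does silently for sinetab[index] and that
   has to be done by hand in assembly. */
static uint16_t word_offset(uint16_t index)
{
    return (uint16_t)(index * 2u);
}
```

With these, mov ax,[gs:2] reads from linear_address(0x8000, word_offset(1)), and the segments 6000h, 7000h and 8000h turn out to be exactly 64 kilobytes apart, so the three arrays never overlap.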

The next two parts calculate the sine table (back in the day you simply could not do trigonometric stuff in real time, even in hardware, although in the intro it’s there just for show) and set the palette. This is pretty straightforward stuff across the two versions. The only difference is that in the Windows version we have to remember the palette has 8-bit color components while the original has 6-bit components (0..255 versus 0..63). And of course, the Windows version simply puts the values in a palette look-up table (because a 32-bit video mode doesn’t use a palette) whereas the original actually sets the video mode colors.
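The component conversion is just a couple of shifts. The helpers below are a sketch with made-up names, and the 0x00RRGGBB pixel layout in pack_rgb() is an assumption about the 32-bit frame buffer rather than something taken from the intro’s source.

```c
#include <stdint.h>

/* VGA DAC components are 6 bits wide (0..63): drop the two low bits. */
static uint8_t to_vga6(uint8_t c8)
{
    return (uint8_t)(c8 >> 2);
}

/* Expand a 6-bit component to 8 bits. Repeating the top two bits in
   the low bits makes 63 map to 255 instead of 252. */
static uint8_t from_vga6(uint8_t c6)
{
    return (uint8_t)((c6 << 2) | (c6 >> 4));
}

/* Pack three 8-bit components into one 32-bit pixel, assuming a
   0x00RRGGBB layout. */
static uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}
```

A palette look-up table for the Windows version would then hold 256 such packed pixels, and copying the 8-bit screen buffer to the window is one table look-up per byte.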

I won’t reiterate the source code for the sine table and palette change here; I think you should be able to figure it out by comparing the two versions. But in theory, here’s how to change a color: first, write the color index to port 3C8h, then write the three color components to port 3C9h (hint: dx first contains 3C8h and it’s incremented to 3C9h to save bytes).

The sine routine increases the angle (st0, the topmost register on the FPU stack) by 2*PI/32768 on each step (a full circle is 2*PI and the sine table has 32768 elements). You should probably check an FPU tutorial; arithmetic on the 8087 is pretty interesting due to its stack-based architecture. For example, you first push two numbers and then do an add, which (in most cases) pops the two values and pushes the result.
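If the stack idea is unfamiliar, this toy model may help. It is deliberately not real x87 semantics (there are no instruction variants, no register renumbering and no error checking) and every name in it is made up; it only shows the push, push, add pattern.

```c
/* Toy model of a stack-based FPU. Eight slots, like the 8087, but
   otherwise a simplification for illustration only. */
#define FPU_DEPTH 8

typedef struct {
    double slot[FPU_DEPTH];
    int top;              /* number of values currently on the stack */
} ToyFpu;

/* Push a value; it becomes the new st0. No overflow check. */
static void fpu_push(ToyFpu *f, double v)
{
    f->slot[f->top++] = v;
}

/* Read the topmost value (st0) without popping it. */
static double fpu_st0(const ToyFpu *f)
{
    return f->slot[f->top - 1];
}

/* Pop the two topmost values and push their sum. */
static void fpu_add(ToyFpu *f)
{
    double b = f->slot[--f->top];
    double a = f->slot[--f->top];
    f->slot[f->top++] = a + b;
}
```

Stepping an angle is then “push the angle, push the step, add”, after which the new angle is the only value left on the stack.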

The texture generation bit is interesting. It also was annoying to port to C thanks to the fact you have to emulate how some instructions work – there are no accurate analogies in the C standard. A big thanks goes to baze whose code I originally cannibalized for this (I think). To be honest the conversion doesn’t work 100 % accurately but does still produce a nice texture.

The algorithm uses addition, bitwise operations and other simple things to achieve complexity, thanks to how processors do arithmetic. Mainly, the result of an earlier calculation is carried over to the next one: when an addition or a subtraction overflows, i.e. the result is too large or too small to fit in a register, the processor lets the result wrap around but sets the carry flag.

This is quite similar to how you would carry digits when calculating with pen and paper. The flag affects the results unpredictably because it’s used across quite different operations; usually you would just use it to add big numbers together as in the pen and paper example.
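The conventional use of the carry flag can be emulated in C, which has no flags of its own. The sketch below shows the pen-and-paper case, a 32-bit addition built from two 16-bit additions; the function names are made up, and the carry is passed around explicitly in an int.

```c
#include <stdint.h>

/* Adds two 16-bit values plus an incoming carry the way a 16-bit CPU
   would: the result wraps around and the overflow ends up in *carry. */
static uint16_t add16_with_carry(uint16_t a, uint16_t b, int *carry)
{
    uint32_t wide = (uint32_t)a + b + (uint32_t)(*carry != 0);

    *carry = wide > 0xFFFFu;
    return (uint16_t)wide;
}

/* 32-bit addition from two 16-bit steps: the low halves first (like
   add), then the high halves plus the carry (like adc). */
static uint32_t add32_via_16(uint32_t a, uint32_t b)
{
    int carry = 0;
    uint16_t lo = add16_with_carry((uint16_t)a, (uint16_t)b, &carry);
    uint16_t hi = add16_with_carry((uint16_t)(a >> 16),
                                   (uint16_t)(b >> 16), &carry);

    return ((uint32_t)hi << 16) | lo;
}
```

The texture generator gets its complexity from doing the opposite of this tidy pattern: the carry left behind by one operation leaks into a later, unrelated one.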

The Main Loop

Here is the meat of the code. The C version has many variables that are named after registers so that the connection with the original code is easy to see. Sometimes, as with the 8-bit registers, the code doesn’t work exactly as in the original because you can’t do things the same way in C. E.g. you can’t have two variables that are also parts of one big register the way al and ah form ax (well, you can with pointers or unions, but that is beside the point, kind of).
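For the curious, the union trick could look like this. It is a sketch, not what the port actually does: RegAX and ax_for_mode_13h() are made-up names, and the layout assumes a little-endian host (as x86 is), so that the first byte of the union is the low half.

```c
#include <stdint.h>

/* ax overlaid with its two halves. Assumes a little-endian host, so
   that the first byte in memory is the low byte (al). */
typedef union {
    uint16_t ax;
    struct { uint8_t al, ah; } b;
} RegAX;

/* Mimics "mov al,13h" on a zeroed ax and returns the resulting ax,
   i.e. the value the video interrupt would see. */
static uint16_t ax_for_mode_13h(void)
{
    RegAX r;

    r.ax = 0;        /* the state the intro counts on at startup */
    r.b.al = 0x13;   /* mov al,13h touches only the low byte */
    return r.ax;
}
```

Writing to r.b.al or r.b.ah changes r.ax and vice versa, which is the register behaviour that plain C variables named al, ah and ax cannot give you.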

Self Modifying Code

I use self modifying code (SMC) in a few places because it produces faster and also much simpler code. For example, if you have a variable that is changed in a few places but used by one instruction only (and the instruction performs arithmetic or otherwise accepts a constant parameter), it’s much faster to store the variable where the constant for the instruction would be. That way you don’t have to read the variable in a register and then use the register to do something.

E.g. Let’s multiply cx by the variable foo:

Original:
  push ax ; save ax
  mov ax,[foo] ; move variable foo in ax
  imul cx,ax ; multiply cx by ax
  pop ax  ; restore ax
  ...
  mov ax,123   ; set foo ...
  mov [foo],ax ; ... to 123
  ...
foo: dw 0

SMC:
  imul cx,123
foo equ $-2
  ...
  mov ax,123   ; set foo ...
  mov [foo],ax ; ... to 123

We can exploit the fact that imul (signed multiplication) accepts constant multipliers. If you looked at the code with a hex editor, you’d see 123 next to the opcode. You can change the constant at run time exactly like you would change a variable: you just define foo as the address where the constant is. The above code defines it as the last two bytes (i.e. word) of the previous instruction; in NASM, $ is the current location, so $-2 is the address of the previous word.

To be concluded…