Programming – kometbomb

Maps in MICRO MURDER (but it’s PICO-8 so it’s hard)

kometbomb — Thu, 20 Apr 2017 17:41:53 +0000

MICRO MURDER: But It’s Robots So It’s OK has a deformable map (that is, you can dig holes in it) that spans multiple screens and all the collisions seem to be per pixel – the holes on the map are not limited to tile boundaries and can be of any radius (and theoretically any shape). On a modern platform (or, even a platfrom from past, as in the original Worms was an Amiga game) it would be a very simple task: store the whole map in memory as a huge bitmap and read and write from it as you please.

On a PICO-8 this is a bit more complicated task: the machine doesn’t have enough memory for having much more than a single screen’s (128×128) worth of pixels in an array and even if it did, it would be very slow to draw pixel by pixel from the array to the screen memory.

Instead, we need to combine a few things, old and new, and apply a liberal amount of smoke and mirrors.

Checking for pixels

We start the collision checking by simply using the map. Using mget() to check for tiles is of course pretty much PICO-8 101 stuff and we can add a single step to also check pixel inside the tile:

function checkpixel(x, y)
 -- transform screen coordinates to tile
 -- coordinates (8x8 tiles)
 x = x / 8
 y = y / 8

 local tile = mget(x, y)

 if tile == 0 then
  -- the tile is not set so we know
  -- there will be no pixels
  return false
 end

 -- manually check if the pixel
 -- inside the map tile is non-zero

 -- 1. find out the pixel coordinate inside 
 -- the tile. it's what's after the decimal dot
 -- scaled to tile size (8x8 pixels)

 local sx = (x - flr(x)) * 8
 local sy = (y - flr(y)) * 8

 -- 2. find out the top-left corner of the tile
 -- inside sprite memory

 local tx = tile % 16 * 8
 local ty = flr(tile / 16) * 8

 if sget(sx + tx, sy + ty) == 0 then
   -- the pixel was zero
   return false
 end

 -- there were no empty pixels
 -- so it was a hit!

 return true
end

We can now easily check every pixel inside the map. But, how do we edit the map pixel per pixel? It’s quite simple:
we just store all the holes (circles) made by projectiles and separately check if the coordinate is inside the hole.

function checkhole(x, y)
 for hole in all(holes) do
  local dx = x - hole.x
  local dy = y - hole.y
  if dx * dx + dy * dy <= hole.r * hole.r then
   return true
  end
 end
 return false
end

If we found a pixel that isn’t inside a hole, the projectile explodes and we simply add a new hole at the coordinates (with a variable radius).

So, here’s the whole thing in short:

Check for if there’s a tile at the coordinate
Check if the coordinate is inside a hole
Check if the tile has a pixel at the “sub-tile coordinate”

Holes are drawn as white circle outlines instead of filled black circles

Note that in the above screenshot also tiles that are fully inside holes are set to empty for speed (we don’t need to check for the individual pixels because we know the whole tile is destroyed).

if checkpixel(projectile.x, projectile.y) and 
 not checkhole(projectile.x, projectile.y) then
 -- BOOM!
 add(holes, {x=projectile.x, y=projectile.y, radius=10})
end

For speed, you might want to experiment by changing the order of the checks, depending on exact circumstances. Also, you will want to use a more efficient way to store the holes if you have thousands of them.

Bouncy bombs

In MM:BIRSIO, there are projectiles that collide with the map and seem to bounce quite accurately according to the slope at the collision point. How to find out the angle from the map pixels?

This is easy: you can get a normal vector by scanning the pixels surrounding the collision point and building a vector that is the sum of the vector pointing from the collision point to the scanned pixel:

-- dx,dy = normal vector at collision point
local dy=0
local dx=0

-- scan the surrounding area, 2 pixels to any direction
for sy=-2,2 do
 for sx=-2,2 do
  if not checkpixel(collision_point.x + sx,collision_point.y + sy) then
   -- found an empty pixel, add vector sx,sy to dx,dy
   dx += sx
   dy += sy
  end
 end
end

-- normalize dx,dy
local d = sqrt(dx * dx + dy * dy)
dx /= d
dy /= d

-- proceed with the usual reflection vector 
-- calculations etc.
-- ...

This is of course just an approximation but it works perfectly fine for our purposes. And, in case of collisions, the ground probably is uneven and has tiny rocks and so on so the bounce would not be mathematically perfect.

Drawing the map

You have probably realized that we have a tiny problem if we draw the map on screen like this (assuming a black background):

cls()

-- draw the untouched, original tiles
map()

-- draw the holes over the tiles
for hole in all(holes) do
 circfill(hole.x, hole.y, hole.radius, 0)
end

The above code will draw the map and holes correctly but you are now limited to a plain black background. If you draw a background before drawing the map over it, as you usually would, the holes would clearly be just black circles stamped against the background. This would ruin the illusion of having a huge, per pixel map.

In MM:BIRSIO, we simply add more smoke: instead of having some magic to make the holes somehow invisible, the starfield is drawn after the map – we manually check the pixel color and decide if the star would be behind the map or visible. This trick would probably be too slow for a full background (drawn pixel by pixel) but for a handful of pixels it’s perfectly fine.

How levels are generated in Spaceman 8

kometbomb — Fri, 10 Mar 2017 21:01:56 +0000

All levels in Spaceman 8 are procedurally generated and get larger and more complicated as the game progresses. PICO-8 has technical limitations so the map generator has to be relatively simple. The level generator works in two steps: first by generating the general structure of the level and then adding superficial details on the map that is eventually drawn on the screen. The map contains randomly placed objects that are often found in the ends of the map corridors.

The seed map

The seed map has two types of cells: “open” and “closed” cells. Map is initialized with closed cells.
All level objects (the collectables and the exits) are treated similarly.

Start with single open cell in the center of the map
Position all objects in the starting cell
Pick a random cell on the map using these rules:
- The cell has to be closed
- Zero to one diagonal (NW, NE, SE, SW) neighbors have to be open
- 1 or 2 cardinal (N, E, S, W) neighbors have to be open
Change the cell type open and move an object from any of the neighboring cells or a randomly picked object in the new open cell
Repeat from step 3 until map is large enough

Demo cart and code

You now have a branching, mazelike map that has all open cells connected at least from one direction. The objects should be positioned (mostly) in the ends of the corridors, further from the starting position (this of course depends on luck and you should fiddle with it if you want to make sure the object locations are always challenging).

Making it pretty

The map is upscaled so that each cell now takes twice the cells vertically and horizontally. There is a random change that a cell flips its state if the neighboring cell is of the opposite type. The resizing is done again and again so that the level corridors are suitably wide (at first they are only one tile wide, then two, four etc.). The randomness from the resizing steps carries over and adds first large and then (relatively) smaller random variation to the edges.

The visible map is built from the data using a tileset that has tiles for different edge/side combinations.

The seed map is scanned cell-by-cell and if the current tile is closed, a bitmask is built from the four neighboring seed tiles in the cardinal directions (see the linked cart for working code). The bitmask tells us the tile we want on the visible map. E.g. if all neighboring seed tiles are closed, the bitmask will be binary) 1111b = 15, we know we want to use tile 15 which is a fully filled tile. If one of the sides is open, let’s say to the east, the mask will be 1101b = 13 and we will use tile 13.

If the current seed tile is empty, the visible map will simply have an empty cell. All of this is very easy to implement when just following instructions (basically just make sure the tile order is the same and it will work) but here is a good article about the technique in case you want more details.

Demo cart and code

Finally, the visible map has some random variations added on it: empty cells have a random change to have a non-black background tile and the fully filled tiles have a random change to have a pebble tile and so on. This randomness doesn’t change the level geometry.

How does Pico Racer work?

kometbomb — Sun, 03 Apr 2016 10:04:39 +0000

After I released Pico Racer, lots of people have thought it looks nice and is an achievement. I don’t think it does anything special or pushes the limits of the Pico-8. Here I try to explain the small tricks I used in the game. If you want a general tutorial about pseudo 3D racers in general then please check out Louis Gorenfeld’s excellent page about the subject, I try to focus on Pico-8 specific stuff.

Road rendering
Sprite rendering
Car rendering and sprite perspective
Night levels and color fades
Pico-8 performance

Road rendering

Usually, with pseudo 3D racers, the road is drawn using a simple trick: there is a full screen graphic of a straight road with perspective and every horizontal line gets shifted left or right, more the further down the road the current horizontal line is. This makes the road look like it’s curving. Palette tricks are used to create the familiar stripes that give an illusion of motion.

With Pico-8, there is no space for a precalculated road asset but we can draw scaled sprites instead. The road in Pico Racer is simply a series of one pixel high sprites scaled by distance, one for both track edges and two used to draw the road surface. The road is a true textured plane instead of a distorted image. Curves works similarly to the above classic method.

Additionally, when rendering the road (from bottom of the screen to the middle of the screen) we iterate along the track data. This is very approximate because we just take the screen Y, calculate the Z and figure out which part of the track the horizontal line belongs to. A more accurate way would be to iterate Z instead of the screen space Y and when Z projected to the screen gives a different screen Y coordinate, we would draw the horizontal line. But going along Y gives a good enough effect, the player will get a good enough hint what is coming towards him.

The road X position on screen is stored for each Y coordinate which we will use later to render the sprites correctly along the road. Note: this gives a slightly crude motion to sprites when they are far away because the X offset are stored per each screen Y coordinate instead of Z. As long as the action is fast, the player will not notice anything.

The road texture graphic used per line is selected so that the further away the line is, the lighter the colors in the texture. Dithering is used to mask the exact position where the texture changes.

One texture

Multiple textures

Dithered textures

This is done because of two reasons: in nature, colors get lighter the further away something is (actually, tint to sky blue beacuse there is air between you and the faraway object) and also so that the road sides and lane markings won’t strobe as much.

Lots of flicker

Less flicker

Sprite rendering

Sprites are rendered basically using the same idea as for the horizontal road lines. First, Z is used to determine the zoom factor. Secondly, sprite X position (on screen) is divided by Z and the road offset for the sprite screen Y coordinate is used to offset the sprites so that they are arranged along the road.

Car rendering and sprite perspective

There is one very important and unavoidable problem with sprites and 3D: the sprites always face the camera. This is not a problem with orthogonal 2D projection, since there is only one way the camera looks at. Everything is on one flat plane. But with a perspective projection the sprites start to look like they are always rotated to face the camera unless you take the time and draw multiple versions of the sprite from different angles. This is how Origin’s Wing Commander and Lucasarts’ Their Finest Hour work. But this takes a lot of effort and more importantly: it needs a lot of space for the sprites.

In Pico Racer, sprite perspective is handled so that most sprites (trees, warning signs) have no perspective. This works reasonably well due to two facts: the objects are two-dimensional (warning signs) AND they are always located to the sides so that the angle to them would be pretty much the same at all times. Only the player car and the opponent cars have visible perspective, since that’s what you look at the most. Also, the cars (especially the player car) have a lot of sideways motion which would immediately make the cars’ perspective look odd.

No perspective

Sprite perspective

Full perspective

The cars use two tricks to fake a convincing perspective: firstly, they have multiple versions of the sprites. Secondly, the cars are built of a number of separate sprites located at slightly different distances. I borrowed this idea from Lankhor’s Vroom and it works well with cars that have no flat sides and instead have clear “sections” – just like F1 cars have. The cars have the rear sprite, the front sprite and a middle section. When the car moves sideways, in addition to showing the sprite from an angle, the rear wheels have some sideways motion against with respect to the front wheels. And, while not a perspective thing, the front section is moved sideways when the player turns left or right, when the road has a curve or when the car bounces giving the illusion the car is yawing and pitching.

Parts of the car

Night levels and color fades

The color fades in Pico Racer are done using the palette mapping feature in Pico-8 and a few lookup tables. Basically, the tables tell which color to use for each of the 16 colors at a set brightness. Hand picking the colors gives the possibility (actually, with the fixed palette the fade will be tinted) to tint the fades so that they mimic sunset and so on. The road, sky and the sprites use a different lookup table each so that e.g. the road has markings visible even at pitch black and the sky and the horizon have a slightly different fade curve (because nature works like that).

Palette lookup table

It all works something like this:

PALETTE={
 {1,1,1,1,2,2,2,0},
 ...
}

FUNCTION SETFADE(LEVEL)
 FOR C=1,15 DO
  -- ALL COLORS WITH VALUE C ARE NOW DRAWN 
  -- WITH THE FADED COLOR
  PAL(C, PALETTE[C][LEVEL])
 END
END

On night levels, the cars are have rear lights drawn after it is dark enough. They are simply two extra sprites drawn without using the fade lookup palette.

Pico-8 performance

The Pico-8 is more than enough to do all the math and rendering in one frame (30 FPS). In fact, only the sprites rendering seems to be a bottleneck. Mainly, when sprites are zoomed, their area grows exponentially and the area is used to calculate how long Pico-8 takes to draw the sprite. Sometimes even large but fully clipped (i.e. outside the screen or the clip rectangle) sprites slow everything down. I found the combined area of the sprites is more important than the number of sprites.

In Pico Racer, I capped sprites so that they never zoom larger than 100 %. This has minimal effect on visuals, although you will easily notice it if you know it’s happening. Since e.g. the cars suddenly stop growing as they get closer to the camera, your brain thinks the cars in fact start shrinking. Likewise, since there is a short range where all the cars are drawn equally large, a car a bit further away looks larger than a car closer to the camera, everything because your brain expects things further down the road to be a bit smaller.

As for performance, limiting sprite size gives a big performance boost since as the sprites come very close to the camera and they get larger and larger, they very quickly get zoomed 200 %, 300 %, 600 % and so on. Further away they are just a dozen of pixels for a very long distance.

Since the area of sprites is the main contributing factor, this makes rendering the road quite efficient because every horizontal line is just one pixel high even though there are four sprites per horizontal line and 64 lines total.

Android NDK and SDL_RWops

kometbomb — Mon, 25 Jul 2011 17:48:53 +0000

Note: There’s now a patch for SDL that makes the same possible on older Androids as well.

The Android NDK makes it possible to use SDL to code Android apps. The aliens.c example bundled with the Android SDL port is fine and dandy except it reads data from /sdcard/data and it has to be pushed manually on the device. A nicer approach is to use the standard SDL way to load data from weird sources: with the SDL_RWops struct. We can use the Android AssetManager object to read data from the APK file (the assets directory in the project), Android 2.3 comes with that has the needed helper NDK stuff.

Now, the SDL example is for Android 1.5 (or 1.6, I forget) which means it doesn’t use NativeActivity but instead rolls its own Java wrapper. That means we don’t have simple access to the asset manager — NativeActivity makes this easy because it calls the native code with all the useful stuff ready in a struct android_app — so we have to look it up ourselves.

In short:

Get access to AAssetManager
1. Using NativeActivity, it’s in struct android_app passed to android_main()
2. If struct android_app is not available (as in the SDL example), store JNI environment in a global variable in the SDL startup code and use it in your code to get AssetManager
Use AAsset_RWFromAsset() defined below to get an SDL_RWops to a file inside the assets directory

We have to modify the SDL startup method defined in the library source code:

SDL_android_main.cpp (line 19-ish)

JNIEnv *g_env;

extern "C" void Java_org_libsdl_app_SDLActivity_nativeInit(JNIEnv* env, jclass cls, jobject obj)
{
    // Store the environment for later use.
    g_env = env;

    /* This interface could expand with ABI negotiation, calbacks, etc. */
    SDL_Android_Init(env, cls);

    /* Run the application code! */
    int status;
    char *argv[2];
    argv[0] = strdup("SDL_app");
    argv[1] = NULL;
    status = SDL_main(1, argv);

    /* We exit here for consistency with other platforms. */
    exit(status);
}

main.c

Note: fatal() is a macro that invokes the Android logging interfaces, much like the LOGE() macro.

#include "android_rwops.h"
#include 
#include 

// g_env is set in SDL_android_main.cpp
extern JNIEnv *g_env;

// This function retrieves a static member of SDLActivity called mAssetMgr
// which is initialized in the onCreate() method like so:
// ...
//   mAssetMgr = getAssets();
// ...
// You can also call the getAssets() method from the native code.

AAssetManager * get_asset_manager()
{
	jclass sdlClass = (*g_env)->FindClass(g_env, "org/libsdl/app/SDLActivity");

	if (sdlClass == 0)
	{
		fatal("org/libsdl/app/SDLActivity not found.");
		return NULL;
	}

	jfieldID assman = (*g_env)->GetStaticFieldID(g_env, sdlClass, 
                          "mAssetMgr", "Landroid/content/res/AssetManager;");

	if (assman == 0)
	{
		fatal("Could not find mAssetMgr.");
		return NULL;
	}

	jobject assets = (*g_env)->GetStaticObjectField(g_env, sdlClass, assman);

	if (assets == 0)
	{
		fatal("Could not get mAssetMgr.");
		return NULL;
	}

	return AAssetManager_fromJava(g_env, assets);
}

int main(int argc, char **argv)
{
  AAssetManager *assets = get_asset_manager();

  // You should check the return value, here we assume logo.png is found
  SDL_RWops *rw = AAsset_RWFromAsset(assets, "image.bmp");

  // Do whatever you want with the rwops
  GfxSurface *i = SDL_LoadBMP_RW(rw, 1);

  ...
}

android_rwops.h

#pragma once

#ifdef _cplusplus
extern "C" {
#endif

#include "SDL_rwops.h"
#include 

SDL_RWops * AAsset_RWFromAsset(AAssetManager *mgr, const char *filename);

#ifdef _cplusplus
}
#endif

android_rwops.c

#include "android_rwops.h"

static SDLCALL long aa_rw_seek(struct SDL_RWops * ops, long offset, int whence)
{
	return AAsset_seek((AAsset*)ops->hidden.unknown.data1, offset, whence);
}


static SDLCALL size_t aa_rw_read(struct SDL_RWops * ops, void *ptr, size_t size, size_t maxnum)
{
	return AAsset_read((AAsset*)ops->hidden.unknown.data1, ptr, maxnum * size) / size;
}


static SDLCALL int aa_rw_close(struct SDL_RWops * ops)
{
	AAsset_close((AAsset*)ops->hidden.unknown.data1);
	SDL_FreeRW(ops);

	return 0;
}


SDL_RWops * AAsset_RWFromAsset(AAssetManager *mgr, const char *filename)
{
	AAsset *asset = AAssetManager_open(mgr, filename, AASSET_MODE_RANDOM);

	if (!asset)
		return NULL;
	
	SDL_RWops *ops = SDL_AllocRW();
	
	if (!ops)
	{
		AAsset_close(asset);
		return NULL;
	}
	
	ops->hidden.unknown.data1 = asset;
	ops->read = aa_rw_read;
	ops->write = NULL;
	ops->seek = aa_rw_seek;
	ops->close = aa_rw_close;
	
	return ops;
}

Lossless Audio Compression

kometbomb — Sun, 02 Jan 2011 23:38:17 +0000

Introduction

Here’s an audio compression scheme I came up for my tracker project. Generally, I wanted an algorithm that meets these three conditions:

Has to compress mono audio losslessly
Has to be simple, should be just one source code file or 500 lines of C code at most
Should compress 8-bit audio that’s upscaled to 16-bits to near 8 bits per sample

Road rendering
Sprite rendering
Car rendering and sprite perspective
Night levels and color fades
Pico-8 performance
Introduction
The algorithm
Delta encoding
Gray encoding
Reordering data in bit planes
Run-length encoding
Improvements
Compression results
Source code

The lossless requirement is obvious and the simplicity gives the benefit of not having to rely on external libraries (e.g. when using something like RAR compression) and not being overkill. The audio data is not huge in any case so the compression ratio doesn’t have to be exceptional. The last requirement is the most important: in klystrack, it is possible to load raw 8-bit audio files, usually originally from the Amiga computer. The native audio format in klystrack is 16-bit, however, so when saving the audio data there always are 8 redundant bits and the file will generally be twice as large as the original file. Also, since the software deals with low-fidelity aesthetics (chiptunes), the audio data might be even lower quality than 8-bit.

While there’s the option of detecting the bit fidelity and only saving the most significant bits, I thought it should be quite simple to devise an algorithm that automatically detects and leaves the redundant bits out while still doing some compression for even 16-bit audio data.

The algorithm

In short, the algorithm does this:

Delta code input data
Gray code the data from above
Reorder the data from above in bit planes. I.e. first output the least significant bit of all samples, then the next bit for all samples and so forth
Run-length encode the above data exploiting the fact you only need to store the distance between bit flips

Delta encoding

By delta encoding the audio data, we can exploit the fact audio data very often changes just a little between samples and so the generated delta values (the amount of change) need fewer bits to be expressed. However, you can’t store 16-bit data with e.g. 8-bit delta values since it is not assured the change is always small enough to be stored in 8 bits, requiring the use of 16-bit delta values in the worst case.

Gray encoding

Using gray codes we can minimize the amount of bit flipping that happens (e.g. when a value changes between 7 and 8, where the bits change from 0111 to 1000 — three bits flip even if the value changes by one) to one bit flip between any two neighboring values. In practice, this doesn’t always result in better compression and sometimes it even can hurt the compression ratio.

Reordering data in bit planes

This step is the key to good compression in our case. Between two neighboring delta values, it’s common that most of the bits are the same and it’s common for one bit to stay completely static for the whole data. So, we take the bits from all samples and sort them so that all significant bits come before the lesser bits. The result is long runs of bits, usually in the most significant bits.

Run-length encoding

Finally, we take the encoded and reordered data and store the bits using RLE. While RLE usually stores the repeated data and the number of repetitions, we can exploit the fact the data is always zero or one, always changing between runs — only the starting state of the bit needs to be stored. Elias gamma coding is used to store the run-lengths since in most cases the run length is in the range of a few bits (for completely empty bit planes we can use special flags that tell they’re all zero or one) and tend to fall in 1-4 bits. With gamma coding it’s possible to store the number 1 by using one bit, the number two takes three bits.

Also, for some bit planes a raw dump of bits with no compression should be used if compression would result in much longer data than the original. There are also two special codes for the bit planes, in cases where all bits are zero or one (common if the audio data is quiet, i.e. uses only a subrange of the 16-bit resolution) or if the data is offset by a constant.

Depending on implementation the total overhead is around two bits per bit plane (i.e. 32 bits or two bytes for 16-bit audio data), not including the header data describing the compressed data (data length, uncompressed length). Since there’s the possibility of a raw dump of bits, in the worst case (none of the bit planes compress at all), the data size grows only by two bytes. A completely quiet (all zero bits) data will take only two bytes for any length.

Improvements

My implementation only compresses the whole audio data as one big chunk but this can be easily improved to compress in chunks. This should improve the compression when there are parts where the audio is quiet and most of the bits will be zero.

Stereo audio can be compressed by compressing two chunks of data, one for both channels. The other channel should be delta coded compared to the other channel since in stereo audio the two channels often heavily correlate.

Since the Gray coding may even hurt compression, there should be the option to disable the coding for chunks. This can be simply by brute force, comparing which combination of options produces the shortest compressed data.

Compression results

The following contains example compression results. The Option column tells different configurations where the input data is delta coded and/or Gray coded. This shows how different preprocessing steps work better for different kinds of input data.

Legend: 0 = no coding before compression, 1 = delta coding, 2 = Gray coding, 3 = delta and Gray coding.

Options	Ratio
Sine wave (loud)
0	73.1 %
1	29.6 %
2	67.9 %
3	22.6 %
White noise (loud)
0	100.0 %
1	100.0 %
2	99.8 %
3	100.0 %
Complex wave
0	75.3 %
1	38.2 %
2	69.6 %
3	30.0 %

Options	Ratio
Sine wave and noise
0	84.7 %
1	100.0 %
2	80.9 %
3	80.9 %
White noise (quiet)
0	50.0 %
1	100.0 %
2	50.0 %
3	55.9 %
Complex wave (saturated)
0	38.9 %
1	21.9 %
2	37.2 %
3	17.7 %

Source code

#include 
#include 
#include 
#include 

typedef unsigned char Uint8;
typedef short int Sint16;
typedef unsigned short int Uint16;
typedef int Sint32;
typedef unsigned int Uint32;


/* bitpack() bit plane codes */
enum
{
	BITPACK_STATIC0,
	BITPACK_STATIC1,
	BITPACK_LITERAL,
	BITPACK_RLE
};


/* bitstream writer helpers*/
typedef struct
{
	Uint8 *buffer;
	Uint32 byte, size_byte; 	// Read/write position counters (byte component)
	Uint8 bit, size_bit;		// Read/write position counters (bit component)
} BitPtr;

#define BIT_BLOCK_SIZE 1024 // Bitstream buffer allocation increment size

void bit_init(BitPtr *p, Uint8 *buffer, Uint32 size_bytes, Uint8 size_bits)
{
	memset(p, 0, sizeof(*p));
	p->buffer = buffer;
	p->size_byte = size_bytes;
	p->size_bit = size_bits;
}


void bit_deinit(BitPtr *p)
{
	free(p->buffer);
	memset(p, 0, sizeof(*p));
}


void bit_rewind(BitPtr *p)
{
	p->bit = p->byte = 0;
}


void bit_seek(BitPtr *p, Uint32 byte, Uint8 bit)
{
	p->byte = byte;
	p->bit = bit;
}


void bit_w(BitPtr *p, int bit)
{
	if (p->bit == 0 && !(p->byte & (BIT_BLOCK_SIZE - 1)))
	{
		// If write position modulo BIT_BLOCK_SIZE is zero, allocate a larger buffer
		p->buffer = realloc(p->buffer, p->byte + BIT_BLOCK_SIZE);
	}

	if (bit)
		p->buffer[p->byte] |= 1 << p->bit;
	else
		p->buffer[p->byte] &= ~(1 << p->bit);

	++p->bit;
	
	if (p->bit == 8)
	{
		p->bit = 0;
		++p->byte;
	}
	
	p->size_byte = p->byte;
	p->size_bit = p->bit;
}


int bit_r(BitPtr *p)
{
	if (p->size_byte <= p->byte && p->size_bit <= p->bit)
		return -1;

	int bit = (p->buffer[p->byte] & (1 << p->bit)) != 0;
	
	++p->bit;
	
	if (p->bit == 8)
	{
		p->bit = 0;
		++p->byte;
	}
	
	return bit;
}


/* Write Elias gamma coded value (has to be nonzero) */
void bit_wgamma(BitPtr *p, Uint32 value)
{
	int l = log2(value);
	
	for (int a=0; a < l; a++)
		bit_w(p, 0); //put 0s to indicate how many bits will follow
		
	bit_w(p, 1);      //mark the end of the 0s
	
	for (int a=l-1; a >= 0; a--) //Write the bits as plain binary
	{
		bit_w(p, value & 1 << a);
	}
	
}


/* Read Elias gamma coded value, zero return value signals read error */
Uint32 bit_rgamma(BitPtr *p)
{
	int numberBits = 0;
	int bit;
	while (!(bit = bit_r(p)))
		numberBits++; //keep on reading until we fetch a one...
	
	if (bit < 0) return 0;
		
	Uint32 current = 0;
	for (int a=numberBits-1; a >= 0; a--) //Read numberBits bits
	{
		if ((bit = bit_r(p)))
			current |= 1 << a;
			
		if (bit < 0) return 0;
	}
	current |= 1 << numberBits; //last bit isn't encoded!
	
	return current;
}


/* Read bits bits */
Sint32 bit_rbits(BitPtr *p, Uint8 bits)
{
	Sint32 rval = 0;

	for (int i = 0 ; i < bits ; ++i)
	{
		int bit = bit_r(p);
		
		if (bit < 0) return -1;
		
		rval |= bit << i;
	}
	
	return rval;
}


/* Write bits bits of v */
void bit_wbits(BitPtr *p, Uint32 v, Uint8 bits)
{
	for (int i = 0 ; i < bits ; ++i)
	{
		bit_w(p, v & (1 << i));
	}
}


/* Gray and delta encoding helpers */
static inline Uint16 gray(Uint16 v)
{
	return v ^ (v >> 1);
}


static inline Uint16 degray(Uint16 v)
{
	v ^= (v>>8);
    v ^= (v>>4);
	v ^= (v>>2);
	v ^= (v>>1);
	return v;
}


static void delta_encode(Sint16 *buffer, const int n)
{
	Sint32 delta = 0;
	Sint32 original;
	for (int i = 0; i < n; ++i)
	{
		original = buffer[i];
		Sint32 x = original - delta;
		buffer[i] = gray(x + 32768); // Gray code can not be negative
		delta = original;
	}
}


static void delta_decode(Sint16 *buffer, const int n)
{
	Sint32 delta = 0;
	for (int i = 0; i < n; ++i)
	{
		Sint32 v = degray(buffer[i]);
		buffer[i] = (v - 32768) + delta;
		delta = buffer[i];
	}
}


/* Compress 16-bit signed data into bitstream, return compressed data size in packed_size (in bits) */
Uint8 * bitpack(const Sint16 *_buffer, const int n, Uint32 *packed_size)
{
	BitPtr bp;
	bit_init(&bp, NULL, 0, 0);
	
	Sint16 *buffer = malloc(sizeof(Sint16) * n);
	memcpy(buffer, _buffer, sizeof(Sint16) * n);
	
	delta_encode(buffer, n);
	
	for (int plane = 0 ; plane < sizeof(*buffer) * 8 ; ++plane)
	{
		const Sint16 mask = 1 << plane;
		int bit = mask & *buffer;
		
		int type = BITPACK_STATIC0 | (bit != 0);
		
		for (int i = 0 ; i < n ; ++i)
			if ((buffer[i] & mask) != bit)
			{
				// Was not all zeros or all ones, needs to compress
				type = BITPACK_RLE;
				break;
			}
			
		Uint32 p_byte = bp.byte;
		Uint32 p_bit = bp.bit;
		
again:
		
		bit_wbits(&bp, type, 2);
			
		switch (type)
		{
			case BITPACK_LITERAL:
				for (int i = 0 ; i < n ; ++i)
					bit_w(&bp, buffer[i] & mask);
				break;
			
			case BITPACK_STATIC0:			
			case BITPACK_STATIC1:
				// No data needed, bit plane is all zeros or all ones
				break;
				
			case BITPACK_RLE:
			{
				// Write starting bit state
				bit_w(&bp, bit);
				
				Uint32 ctr = 0;
				
				for (int i = 0 ; i < n ; ++i)
				{
					if (((buffer[i] & mask) == bit))
						++ctr;
						
					if ((buffer[i] & mask) != bit)
					{
						bit_wgamma(&bp, ctr);

						ctr = 1;
						
						// Flip the bit (no neighboring bits are the same state)
						bit ^= mask;
					}
				}
				
				if (ctr != 0) bit_wgamma(&bp, ctr);
				
				if ((bp.byte * 8 + bp.bit) - (p_byte * 8 + p_bit) > n + 2)
				{
					// RLE gave longer data than the original, dump data instead
					bit_seek(&bp, p_byte, p_bit);
					type = BITPACK_LITERAL;
					goto again;
				}
				
			}	
			break;
		}
		
	}
	
	free(buffer);
	
	*packed_size = bp.byte * 8 + bp.bit;
	
	return bp.buffer;
}


/* Decompress compressed bitstream into 16-bit signed data, decompress at most packed_size bits
	unpacked_size tells the function the data length (important)
 */
Sint16 * bitunpack(const Uint8 *packed_data, const Uint32 packed_size, Uint32 unpacked_size)
{
	BitPtr bp;
	bit_init(&bp, (Uint8*)packed_data, packed_size / 8, packed_size & 7);
	
	Sint16 *buffer = calloc(unpacked_size, sizeof(Sint16));
	
	for (int plane = 0 ; plane < sizeof(*buffer) * 8 ; ++plane)
	{
		const Sint16 mask = 1 << plane;
		
		Sint32 type = bit_rbits(&bp, 2);
		
		if (type < 0) goto read_error;
			
		switch (type)
		{
			case BITPACK_LITERAL:
				for (int i = 0 ; i < unpacked_size ; ++i)
				{
					int bit = bit_r(&bp);
					if (bit < 0) goto read_error;
					if (bit) buffer[i] |= mask;
				}
				break;
				
			case BITPACK_STATIC0:
				// Data bits are zero by default, no action needed
				break;
				
			case BITPACK_STATIC1:
				// Fill bit plane with set/unset bit
				for (int i = 0 ; i < unpacked_size ; ++i)
					buffer[i] |= mask;
				break;
				
			case BITPACK_RLE:
			{
				// Read the starting bit status
				int bit = bit_r(&bp);
				
				if (bit < 0) goto read_error;
				
				if (bit) bit = mask; else bit = 0;
				
				buffer[0] |= bit;
				
				for (int i = 0 ; i < unpacked_size ; )
				{
					Uint32 ctr = bit_rgamma(&bp);
					
					if (ctr == 0) goto read_error;
					
					for (; i < unpacked_size && ctr ; ++i, --ctr)
						buffer[i] |= bit;
					
					if (ctr) goto read_error;
					
					// Flip the bit (neighboring bits are always different)
					bit ^= mask;
				}
			}		
			break;
		}
	}
	
	delta_decode(buffer, unpacked_size);
	
	if (0)
	{
read_error:
		free(buffer);
		return NULL;
	}
	
	return buffer;
}


/* Generate test data, 0 is a sine wave and 1 generates noise */
void getdata(Sint16 *buffer, const int n, int t)
{
	switch (t)
	{
		case 0:
			for (int i = 0 ; i < n ; ++i)
			{
				buffer[i] = (((int)(sin((float)i / 1000 * 3.141 * 2) * 20000)));
			}
			
			break;
			
		case 1:
			for (int i = 0 ; i < n ; ++i)
				buffer[i] = (rand() * 4);
			
			break;
	}
}


int main(int argc, char **argv)
{
	const int n = 100000;

	Sint16 *b = malloc(sizeof(Sint16) * n);
	Uint32 packed_size;
	
	getdata(b, n, 0);
	
	Uint8 * data = bitpack(b, n, &packed_size);
	Sint16 * udata = bitunpack(data, packed_size, n);
	
	if (udata == NULL || memcmp(b, udata, sizeof(Sint16) * n) != 0)
	{
		printf("Upack failed\n");
	}
	else
	{
		printf("Compression ratio %.1f %%\n", (float)100 * packed_size / (n * sizeof(Sint16) * 8));
		
	}
		
	free(data);
	free(udata);
	free(b);
	
	return 0;
}

An Alternative to XML

kometbomb — Sun, 28 Feb 2010 18:42:06 +0000

My new weapon of choice. libconfig is a configuration file parser that supports arrays, named and typed members, selection by path (e.g. cfg.users.[3].name) and more. That is, it has basically all the useful (as in 80% of cases) features of XML and none of the bad. There’s a minimal but well defined structure that will work for most situations and that can be used to enforce e.g. all array items having to be of the same type. There’s no overkill markup so it’s easy to read and write by humans. The library can also write the settings tree into a text file.

The configuration files look like this:

screen = { width = 300; height = 200; }
users = ( { name = "Torgo"; items = [ "Item 1", "Item 2" ]; } );

And in C you would do something like this:

int screen_height = 100;
const char *name;
config_init(&cfg);
config_read_file(&cfg, "config");
if (!config_lookup_int(&cfg, "screen.height", &screen_height))
  puts("Using default screen height");
if (config_lookup_string(&cfg, "users.[0].name", &name))
  puts(name);
config_destroy(&cfg);

You can also iterate the setting tree without the path for easier array or tree traversal. In all, I would say it involves less work compared to any XML library, especially in C. I like to think it’s a good example of software designed by the same guy who also uses it and not by some external committee.

Creating a Simple GUI from Scratch

kometbomb — Thu, 31 Dec 2009 17:27:20 +0000

Introduction

More often than not, the biggest obstacle for a programmer is adding a GUI to a otherwisely ready application. This probably is because you generally can create a program that inputs and outputs text with a few lines of code in any language. However, if you don’t count MessageBox() and other similar helper APIs that are available in most windowing systems and that are as simple to use as gets() and printf(), it’s a unnecessarily big step to change a command line program into a program that outputs the exact same text but with complex command line arguments replaced by a few buttons and a window with the outputted text.

However, if you are developing a program that already draws something on the screen, it’s really easy to add simple mouse interactivity. This is especially true if you are using SDL (even if it’s extremely bare bones in these things!) and already use SDL_rect (your standard rectangle) to draw things. You can simply change the draw routine so that it takes one more argument which would be the SDL_event struct you’re already using to check for a quit message and so on. Then for every object you are drawing on screen, check if the event if a mouse button down event and that the coordinates are inside the rectangle that will be drawn. This eliminates completely the need for having to plan a separate system that interprets the mouse events. It’s sort of piggybacking on the existing code with minimal changes to the existing code.

Now, you might already think that is way too simplistic and lazy but think again. Not many programs need a more complex GUI with multiple movable windows and so on. If you really do need that, then be my guest and create a exhaustive windowing system or take the time to learn an existing system. But it is still overhead and you have to do comparatively a lot of work before any real results. At least I hate that kind of mental overhead. And while this whole idea of combining the drawing and the event processing sounds like a bit of a hack, it really isn’t a “hack” as in patching something you will probably have to replace with a better solution later. It’s just a different way of doing almost the same thing. With less overhead.

I have used this approach in my latest project and from direct experience I know it is possible to create scrollbars, scrolling text fields, text input fields, message boxes and pretty much anything I have needed. And it’s not too much code either even if you have to create absolutely everything from nothing. In other words I do not feel limited or burdened.

How to do it

As explained above, you most likely are drawing to a specific region on the screen for every object you need to check for mouse clicks. It is not necessary to have any kind of array where those regions are. You need to make sure that for every mouse click event runs the draw loop once and that the mouse click event gets passed to the draw routine. Then check if the coordinates are inside the draw region.

This is where the part starts that it admittedly gets a bit hacky: if a button is pressed, the event that the button triggers will be run in the middle of the draw loop (unless you somehow buffer the events). In most cases this doesn’t matter at all but it could be that what is on the screen is not exactly how it really is; you might have two selected items for a duration of one frame and so on. For the most part this doesn’t matter since you are getting results with very little overhead.

Consider this example.

void draw_stuff()
{
  SDL_Rect position = {10, 10, 40, 40};

  SDL_BlitRect(button_gfx, NULL, screen, &position);
}

void draw_stuff_and_check_events(SDL_Event *event)
{
  SDL_Rect position = {10, 10, 40, 40};

  if (event->type == SDL_MOUSEBUTTONDOWN 
   && event->button.x >= position.x
   && event->button.y >= position.y  
   && event->button.x < position.x + position.w 
   && event->button.y < position.y + position.h)
     button_event();

  SDL_BlitRect(button_gfx, NULL, screen, &position);
}

void event_loop()
{
  SDL_Event event;

  while (SDL_PollEvent(&event))
  {
    if (event.type == SDL_QUIT) break;

    draw_stuff(&event);
  }
}

From the above example, the benefits of this approach are obvious. The event checking can simply be injected anywhere in the code as long as you have the event and a region.

Solutions to common problems

One downside in this is that if you draw multiple overlapping regions, the region drawn last will be the one that is visible on the screen but the click is handled by the first region. Our event checking sees the regions from the other side of the screen. In such cases you can first iterate the regions in reverse order checking the event and then draw them in the correct order. An example in the mentioned project is the menu, submenus often overlap the parent menu; I solved that by first going through the menus in the order the user sees them and then drawing them back to front.

Dragging items is easy: simply check for mouse motion events and if the button is held down and the mouse is moved, just adjust the position of the matching region. You do not need any special “drag starts now” and “drag ends now, update objects” phases. However, this simplistic method is subject to the previous issue if you move a region over one that seems to be under it. You can also simply record the clicked object when the mouse button is pressed and make the motion events match only the selected region.

FAQ

I absolutely need a separate window

This is possible as long as the window can be modal (like a message or open file dialog that takes over from the window behind it). You simply jump to another event loop until the new window is closed, much like it’s done in the Windows API with GetOpenFileName() or the MessageBox() mentioned earlier. Then it’s just a matter of drawing the new window and checking for the events normally.

Is this reusable?

Of course. You could create a small library that has basic functionality and helper functions. A practical example can be seen here.

Even if you can use the event checking straight in the source code, you can still define the regions in an array and link to relevant event handlers.

Conclusion

Combining the drawing and the event processing code can save time in the short term. Many common GUI elements are perfectly possible to replicate. The idea described above should be considered if a project needs mouse interaction and external libraries are not available, the conversion of existing code seems expensive or the learning curve is too steep compared to the future benefits. Nonetheless, in borderline cases, it can be well worth prototyping due to the minimal overhead and developing considerations.

Plenty of Room, Part II

kometbomb — Mon, 09 Nov 2009 19:47:17 +0000

This is the second part of the epic (~~two~~ three-part) series of articles about tiny intros, the previous part was an essay about 256-byte intros in general.

Ed. note: Since this article seems to take forever to finish, here’s the first half of it. The (hopefully) final part will detail more of the specifics.

As stated earlier, tiny intros written in assembly language fascinate me. I have written a few in x86 assembly language, here’s one of them. I have tried to make the inner workings of the program as accessible — or, at least as thought-provoking — as possible even if assembly wasn’t their weapon of choice.

I’ve included the original DOS intro (you can use DOSBox to run it, it should also work on Windows XP) and a Win32 port of it, written in C while trying to keep the original structure intact. I’ll also try to explain the general idea of the effect in pseudocode where syntax can be an obstacle. The archive rubba_c.zip contains the source code, rubba_b.exe which is the compiled Win32 executable and RUBBA.COM which is the 16-bit MS-DOS executable. To compile the C program, you need the TinyPTC library (like SDL but even more bare-bones).

I won’t go into details about x86 assembly language, I suggest anyone interested first reads an introduction and learns some of the basics. However, I’ll try to make the code look interesting, explain some weird or obfuscated code and probably show some basic size optimization tricks.

The Effect

The intro, called rubba_b, shows one effect: a twisting bar drawn using primitive light-shaded texture mapping. The color palette and the texture are generated run-time. The texturing is done using linear interpolation and no vector math is used even if the bar looks like it is rotated. The lighting is an extremely simple approximation of the light source being exactly where the camera is located. That is, the length of the textured line directly determines the light.

If looked from above, the bar will be a tower of squares. If one of the squares is rotated around the center, the corners will move in a circular pattern. So, the X coordinate will be equal to cos(corner_number*(360 degrees/4)+square_rotation), the Z coordinate (why Z? Because it goes towards the screen) is equal to the sine but it can be discarded because we will not need it for perspective calculation. Remember, we’re short on bytes.

We then modulate the bar rotation for each square in the tower. If the amount of rotation changes when moving up or down the tower, the bar will twist. If the rotation stays the same for each square, the bar will rotate uniformly and look uninteresting.

The textured horizontal line is drawn from each corner to the next corner, from left to right. If the line would be drawn from right to left, we know it isn’t facing towards the camera, another line facing the camera will be drawn over it and we simply skip the line. The color value fetched from the texture is multiplied by the line length which makes short lines darker.

Still with me?

The Code

Initialization

First things first. We need to set the video mode before we continue. In the Win32 version we simply open a TinyPTC window, in the original version we need to tell BIOS to go in a nice 320x200x8 video mode (the de facto video mode back in the day).

C	asm
`ptc_open("rubba_b",320,200)`	`mov al,13h int 10h`

In the above code, the Win32 part is self-explanatory. The assembly bit probably needs some clarification: we put the number 13h (hex) in the 8-bit register al and we request for interrupt 10h. This is the BIOS video interrupt and since register ax (which consists of al – “low” – and ah – “high”) equals to 0013h (when the program starts, ax is always zeroed), BIOS will call the routine for changing the video mode (ah=00h) to 13h.

If above sounds complicated, don’t worry. It’s just a matter of memorization – similar to how you would memorize the function name for changing the video mode.

The next thing we need is some space for the texture, the sine table and the double buffer. In the Win32 version this is obvious, we just add a few arrays (although since TinyPTC only supports 32-bit video modes, we will also have an array for the palette). Again, in the assembly version we won’t use any fancy way to reserve memory to save some precious bytes: we simply decide certain areas of the memory will be used for whatever we want. The benefits of no memory protection and single tasking. ;)

C	asm
`short int sinetab[SINETAB]; unsigned char palette[2564]; unsigned char texture[256256]; unsigned char screen[320*200];`	`mov dh,80h mov gs,dx mov dh,70h mov es,dx mov dh,60h mov fs,dx`

The assembly version basically sets the register dx to 60xxh-80xxh (we set only the high byte, i.e. dh to save bytes, thus the low byte of dx is undefined – it won’t matter) and puts the value into various segment registers (es-gs).

This makes it so that if we use the different segment registers, we can access each of the three 64 kilobyte segments as a 64 kilobyte array. E.g. the sine is in gs, thus mov ax,[gs:2] would move the second word in the sine table in ax. In C, the equivalent would be short int value=sinetab[1] (note how the C compiler knows the fact that a word is 2 bytes but in assembly you have to take care of that yourself – all memory accesses are always by the exact location, not the array element!).

All this is because in 16-bit memory accessing you can see only 64 kilobytes at one time. You can’t have a 128 KB array, nor can you have two 64 K arrays in the same segment. It’s something comparable to having a flashlight and a big sheet of paper in a dark room; you can move the light to show different areas but you will never see more than what there is in the spotlight.

The next two parts calculate the sine table (back in the day you simply could not do trigonometric stuff real-time, even in hardware — although in the intro it’s there just for show) and set the palette. This is pretty straight-forward stuff across the two versions. The only difference is that in the Windows version we have to remember the palette has 8-bit color components and the original has 6-bit components (0..255 ~ 0..63). And of course, the Windows version simply puts the values in a palette look-up table (because 32-bit video mode doesn’t use a palette) and the original actually sets the video mode colors.

I won’t reiterate the source code for the sine table and palette change here, I think you should be able to figure it out by comparing the source code. But in theory, here’s how to change a color: first, write the color index in port 3C8h, then write the three color components in port 3C9h (hint: dx first contains 3C8h and it’s incremented to 3C9h to save bytes).

The sine routine increases the angle (st0 the topmost register on the FPU) by 2*PI/32768 (a full circle is 2*PI, the sine table has 32768 elements). You probably should check some FPU tutorial, arithmetic on the 8087 is pretty interesting due to the stack-based architecture. For example, you first push in two numbers and then do an add, which (in most cases) pops out the two values and pushes in the result.

The texture generation bit is interesting. It also was annoying to port to C thanks to the fact you have to emulate how some instructions work – there are no accurate analogies in the C standard. A big thanks goes to baze whose code I originally cannibalized for this (I think). To be honest the conversion doesn’t work 100 % accurately but does still produce a nice texture.

The algorithm uses addition, bitwise operations and other simple things to achieve complexity thanks to how processors do arithmetics. Mainly, the results from an earlier calculation is carried over to the next calculation — when an addition or a subtraction overflows, i.e. the result is too large or too small to fit in a register, the processor lets the result wrap over but sets the carry flag.

This is quite similar to how you would carry numbers when calculating with a pen and a paper. The flag affects the results unpredictably because it’s used across quite different operations; usually you would just use to to add big numbers together as in the pen and paper example.

The Main Loop

Here is the meat of the code. The C version has many variables that are named after registers in order to see the connection with the original code. Sometimes, as with the 8-bit registers, some code doesn’t work exactly as in the original because you can’t do things similarly in C. E.g. you can’t have two variables that also are a part of one big register similarly how al and ah form ax (well, you can with pointers or unions but that is besides the point, kind of).

Self Modifying Code

I use self modifying code (SMC) in a few places because it produces faster and also much simpler code. For example, if you have a variable that is changed in a few places but used by one instruction only (and the instruction performs arithmetic or otherwise accepts a constant parameter), it’s much faster to store the variable where the constant for the instruction would be. That way you don’t have to read the variable in a register and then use the register to do something.

E.g. Let’s multiply cx by the variable foo:

Original SMC

  push ax ; save ax
  mov ax,[foo] ; move variable foo in ax
  imul cx,ax ; multiply cx by ax
  pop ax  ; restore ax
  ...
  mov ax,123   ; set foo ...
  mov [foo],ax ; ... to 123
  ...
foo: dw 0

  imul cx,123
foo equ $-2
  ...
  mov ax,123   ; set foo ...
  mov [foo],ax ; ... to 123

We can exploit the fact imul (signed multiplication) accepts constant multipliers. If you looked at the code with a hex editor, you’d see 123 next to the opcode. You can change the constant run-time and you do that exactly like you would change a variable: if you just define foo as the address where the constant is (the above code defines it as the last two bytes (i.e. word) of the previous instruction: in NASM, $ is the current location and $-2 equals the address of the previous word).

To be concluded…

Introducing My Latest Projects

kometbomb — Tue, 22 Sep 2009 01:34:30 +0000

… Or, How to Procrastinate Productively.

I decided to make one of my current projects open source and post them on Google Code just for fun. The project is a tool chain that I’m using to remake Thrust. In reality, I decided to divide the project into two separate projects: the actual game engine (called klystron) and related tools, and a music editor that uses the engine.

Here are two videos I made a while ago that demonstrate the engine. The first is the music editor (called klystrack) — it’s much less ugly at the moment but the sound synthesis is the same, and that’s what matters:

The sound engine (“Cyd”) is basically a very simple software synthesizer with capabilities comparable to the SID or any 8-bit machine from the 80s. The editor is a fairly standard tracker, much like GoatTracker.

The graphics half of the engine is basically a wrapper around a quite fast collision detection system (pixel-accurate, or it wouldn’t be much good for a thrustlike) built on SDL. It also does background collisions and drawing as well. As you may have guessed, the whole point is to provide a limited but still helpful set of routines that are useful for creating 2D games not unlike what video games were in 1991.

And, here’s a proof I’m actually working on the actual game (the sound effects are created in real time by the sound engine):

A note on Google Code: it’s rather nice. It provides the standard open source development stuff like source control an such but I really like how clean and hassle-free it is. Adding a project takes a minute and after that it’s simply coding and some quick documentation on the project wiki. The project wiki is good example of how simple but elegant the system is: the wiki pages actually exists inside the source control as files, just like your source code.

Go check Google Code out and while you’re at it, contribute on my projects. :)

You Can Stop Programming Now

kometbomb — Tue, 08 Sep 2009 13:36:26 +0000

The above is puls, a 256-byte intro by ?r?ola. It’s basically a raytracer with screen space ambient occlusion (which makes it so much realistic and cooler). While tube — which I think was the best 256-byte intro until now (when design and code are judged together) — also did raytracing of a cylinders, and after that many other intros did similar tracing of more complex surfaces, puls simply crushes all of them with objects that are formed by multiple plane surfaces (e.g. a cube would be a combination of six intersecting planes), a very nice color palette and that delicious ambient occlusion.

Thinking it out in C and sketching it out in asm took about a week, byte crunching took another one… that’s like forty hours of full focus and eighty of playing.

It’s also really, really slow which is the only minus especially because you can’t run 16-bit executables on Windows 7, so you have to use DOSBox to watch it (or, use a boot floppy to run it or something). There’s now a Windows port including a screensaver, see the Pouet.net page for more. A big thank you to nordak5 who was kind enough to upload a video on Youtube.

?r?ola has also included source code with the binary that you can find over here. That said, I’ll be deleting all my own source code since perfection has finally been achieved and there is no need for programmers anymore.

Programming – kometbomb

Maps in MICRO MURDER (but it’s PICO-8 so it’s hard)

Checking for pixels

Bouncy bombs

Drawing the map

You might also like...

How levels are generated in Spaceman 8

The seed map

Making it pretty

You might also like...

How does Pico Racer work?

Road rendering

Sprite rendering

Car rendering and sprite perspective

Night levels and color fades

Pico-8 performance

You might also like...

Android NDK and SDL_RWops

SDL_android_main.cpp (line 19-ish)

main.c

android_rwops.h

android_rwops.c

You might also like...

Lossless Audio Compression

Introduction

The algorithm

Delta encoding

Gray encoding

Reordering data in bit planes

Run-length encoding

Improvements

Compression results

Source code

You might also like...

An Alternative to XML

You might also like...

Creating a Simple GUI from Scratch

Introduction

How to do it

Solutions to common problems

FAQ

I absolutely need a separate window

Is this reusable?

Conclusion

You might also like...

Plenty of Room, Part II

The Effect

The Code

Initialization

The Main Loop

Self Modifying Code

You might also like...

Introducing My Latest Projects

You might also like...

You Can Stop Programming Now

You might also like...