Making floating point numbers smaller

When making a demo in 64kB (or less!), many unexpected issues arise. One issue is that storing floating point numbers can take quite a lot of space in the binary file. Floats are found everywhere: positions of objects in the world, position of the camera, constants for the effects, colors in the texture generator, etc. In practice, we often don’t need as much precision as floats offer. Can we take advantage of that to pack more data in a smaller space?

In many cases, it’s not important if an object is 2.2 or 2.21 meters high. Our goal is to reduce the amount of space used by those numbers. A float takes 4 bytes (8 bytes for a double). The actual size on disk can be reduced a bit with compression, but when there are thousands of floats, it’s still quite big (compression doesn’t work well as the binary data looks quite random). We can do better.

A naïve solution

Suppose we have some numbers between 0 and 1000, and we need a precision of 0.1. We could store those numbers as integers between 0 and 10000 and then divide by 10. Whether we use 32-bit or 16-bit integers in the code doesn’t make a difference: since we don’t use their full range of values, all these integers start with leading 0s. The compression code will detect such repetitive 0s and use around 13 bits per number in both cases.

The problem with this solution is that we need to do some processing at runtime. Each time we use a number, we have to convert it to a float and divide it by 10. If all our data is in the same place, we can loop over it. But if we have numbers all over our code base, we’ll also need to call a conversion function in all those places. This simple operation can be cumbersome and expensive in terms of space.

It turns out we can get rid of the processing, and directly use floating point numbers.

A note on IEEE floats

Floats are stored using the IEEE 754 standard. Some of them have a binary representation that contains lots of 0s and compresses better than others.

Let’s look at two examples using a binary representation. The IEEE representation is not exactly the same as in the example below (it has to store the exponent), but almost.

  • 6.25 -> 110.01
  • 6.3 -> 110.010011…

In fact, 6.3 has no exact representation in base 2: the number stored is an approximation, and it would require an infinite number of digits to represent 6.3. On the other hand, the binary representation for 6.25 is compact and exact.

If we’re optimizing for size, we should prefer numbers like 6.25, which have a compact binary representation. For example, 0.125, 0.5, 0.75 and 0.875 have at most 3 digits in binary after the binary point. Their binary representation ends with a long run of 0s, which compresses really well. The great thing is that we don’t need processing code anymore, because we’re still using standard floats. We just store normal floats, but we favor floats whose representation includes lots of 0 bits and will compress well.

To better understand the IEEE representation, try one of the tools that visualize floats. You’ll see how clearing the trailing 1s reduces the precision.

How much precision do we need?

Floats are much more precise for values around 0. As our numbers get bigger, we’ll have less and less precision (or we’ll need more bits).

The table below is useful to check how much precision is needed. It tells you the worst error to expect based on the number of bits kept per float, and the scale of the input numbers. For example, if the input numbers are around 100 and we use 16 bits per float, the error will be at most 0.25. If we want the error to be less than 0.01, we need 21 bits per float.

Bits kept | around 1 | around 10 | around 100 | around 1000
12        | 0.06     | 0.5       | 4          | 32
16        | 0.004    | 0.03      | 0.25       | 2
20        | 0.0002   | 0.002     | 0.016      | 0.125
24        | 0.00002  | 0.0001    | 0.001      | 0.008

Of course, each time you add a bit, you halve the worst-case error.

How to automate it?

An ad hoc solution is to remember this list of numbers and use them in the code when possible: 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875. An alternative is to use a list of macros from Iñigo Quilez. As Iñigo points out, this is not very elegant. Fortunately, this is hardly a problem because chances are this is not where most of your data lies.

64kB can actually contain a lot of data. Developers often rely on tools and custom editors to quickly modify and iterate on the data. In that case, we can easily use code to truncate the floating point numbers as part of the process. If you use a tool to set the camera (instead of manually entering the position numbers), that tool could round the floats for you.

Here is the function we use to round the binary representation of the floats:

// roundb(f, 15) => keep 15 bits in the float, set the other bits to zero
float roundb(float f, int bits) {
  union { int i; float f; } num;

  bits = 32 - bits; // assuming sizeof(int) == sizeof(float) == 4
  num.f = f;
  num.i = num.i + (1 << (bits - 1)); // round instead of truncate
  num.i = num.i & (-1 << bits);
  return num.f;
}

Just pass the float, choose how many bits you want to keep, and you’ll get a new float that will compress much better. If you generate C++ code with that number, be careful when printing it (make sure you print it with enough decimals):

printf("%.10ff\n", roundb(myinput, 12));

The great thing about this function is that we decide exactly how much precision we want to keep. If we desperately need space at some point, we can try to reduce that number and see what happens.

By applying this technique, we’ve managed to save several kilobytes on our 64kB executable.
Hopefully you will, too.


New release: final version of H – Immersion

After months of polishing, we’ve finally released the final version of our latest 64kB intro: H – Immersion. You can read the details, download the binary or just watch the captured video from the production page.

We’re also currently working on a write-up to show some of the techniques involved in this intro, which we hope to publish here soon.

How can demoscene productions be so small?

People not familiar with the demoscene often ask us how it works. How can a 64kB file contain so much? It can seem magical, since a typical song compressed as MP3 can be 100 times as big as our animations – not to mention the graphics. People also ask why other programs and games are getting so big. In 1990, when games had to fit on one or two floppy disks, they used only 1 or 2MB (which is still 20 times as much as our 64kB intros). Modern games now use 10-100GB.

The reason for that is simple: Software engineering is all about making trade-offs. The typical trade-off is to use more memory to improve performance. But when you write a program, there are many more dimensions to consider. If you optimize on one dimension, you might lose on the other fronts. We make optimizations and trade-offs that wouldn’t make any sense outside the demoscene.

First, we optimize all the data we store in the binary. We use JSON files during development for convenience, but then we generate compact C++ files to embed in the binary. This saves us a few kilobytes. Is it worth doing? If you had to make a demo without the 64kB limit, you wouldn’t waste time on this. You’d prefer the 70kB executable instead. It’s almost the same.

Then, we compress our file (kkrunchy for 64kB intros, crinkler for 4kB intros). Compression slows down the startup time and antivirus software may complain about the file. It’s generally not a good deal. I bet you’ll choose the 300kB file instead. It’s still small, right?

We use compiler optimizations that slightly slow down execution to save bytes. That’s not what most users want. We disable language features like C++ exceptions, we give up object-oriented programming (no inheritance), and we avoid external libraries – including the STL. This is a bad trade-off for most developers, because it slows down development. Instead of rewriting existing functions, you’d prefer the 600kB file.

Our music is computed in real-time (more precisely, we start a separate thread that fills the audio buffer). This means that our musician has to use a special synth and cannot use his favorite instruments. That’s a huge constraint that very few musicians would accept outside the demoscene. They would send you an MP3 file instead. You’d also need an MP3 player, and your demo is now 10MB.

Similarly, we generate all the textures procedurally. And all the 3D models. For that, we write code, and this is a lot of work. It adds a huge constraint on what we do (but constraints are fun and make us more creative). While procedural textures have lots of benefits, your artists will prefer their usual tools. You get JPEG images and – even if you’re careful – your demo size increases to 20MB.

At this point, you may wonder if it makes sense to write your own engine. It’s hard and error-prone, and other people have probably made a better one than you would. You could use an existing engine instead, but it would add at least 50MB. And this is still a simple application made by a small team; you can imagine what happens when you scale this up to a full game studio.

So demosceners achieve very small executable sizes because we care deeply about it. In many regards, demoscene works are an art form. We make decisions meant to support the artistic traits we’re pursuing. In this case, we’re willing to give up development velocity, flexibility, loading time, and a lot of potential content to fit everything in 64kB. Is it worth it? No idea, but it’s a lot of fun. You should try it.

Immersion

It’s been 4 years since the last message here. Sorry about that.

It’s time for a quick update. Here is what we did since the last blog post:

  • G – Level One was released at Tokyo Demofest 2014 and got the 1st place. One year later, we made a final version.
  • We released H – Immersion last week at Revision. It ranked 5th (out of 10). If you haven’t seen them, you should check the other prods of the competition.
  • We’ve put the source of Felix’s Workshop on github.

What’s next?

  • We’re working on a final version for Immersion. Sorry that the party version was not polished enough.
  • I’d like to write a detailed making-of article for Immersion.
  • We’re going to update Shader Minifier.

F – Felix’s workshop to be shown at SIGGRAPH 2013

This is the breaking news that crashed into our mailboxes yesterday. The PC 64kB Intro we released last year at Revision, F – Felix’s workshop, has been selected to be shown at SIGGRAPH 2013 as part of the Real-Time Live! demoscene reel event.

Unfortunately we won’t be able to attend SIGGRAPH this year, but having our work shown there is awesome news.

Back from Revision

I don’t know if this is going to become some sort of tradition for us, but as a matter of fact, we have attended every Easter party since the creation of our group. This year was no exception, and we had a really great time at Revision.

Revision is the kind of party that is just big enough that, even though at some point you think “OK, I’ve met pretty much everyone I wanted to”, when you get home you realize how many people you wanted to meet and did not. It’s also the kind of party that is so massively awesome that when you get back to your normal life, you experience some sort of post-party depression on top of the exhaustion, and you have to be prepared for when it strikes.

Sidrip Alliance performing at Revision

So we’ve been there, and this year we presented the result of the last months of work in the PC 64k competition. The discussion of the concept started back in May 2011, and we seriously started working on it maybe around August.

As Revision approached, rumors grew about who would enter the competition, how serious they were about it, and how likely they were to finish in time. It became very clear that the competition was going to be very interesting, but even so, it completely exceeded expectations. It even got mentioned on Slashdot!

Our intro, F – Felix’s workshop, ended up in 2nd place, behind Approximate’s gorgeous hypno-strawberries, Gaia Machina. The feedback has been very cheerful, during the competition as well as afterwards. And as if that were not enough, to our surprise our previous intro, D – Four, has been nominated for two Scene.org Awards: Most Original Concept and Public Choice. Do I need to say we’re pretty happy with so much good news? :) Thank you all!

Now that a week has passed, we’re back to our daily lives, slowly recovering, and already thinking about what we’ll do next. :) Until then, here is a capture of our intro:

Shader Minifier 1.1

I’ve just released Shader Minifier 1.1. You can download the binary at the usual place.

Changes

  • New output options: use --format js to generate a JavaScript file, and --format c-array to get a comma-separated list of strings (to be included in a C array).
  • Use the new option --no-renaming-list if there are identifiers you don’t want renamed (e.g. entry point functions in HLSL).
  • If you have #define macros, Shader Minifier will now avoid conflicts between macros and identifier renaming.
  • If your code has conditions with compile-time known values, they will get simplified (e.g. if (false), or int i_tag = 2; if (i_tag < 4) ...).
  • If there are many identifiers in your code (this probably won’t happen in a 4k intro), Minifier will now use 2-letter names if needed. This case is not fully optimized, but some people use this tool for obfuscation instead of size optimization.
  • The option --preserve-all-globals won’t rename any global variable or any function. This is useful if your shader is split between multiple files.
  • You can now tell the Minifier not to parse some block of code. Put your code between //[ and //] and it won’t be parsed, identifiers won’t be renamed, but spaces and comments will still be removed. This is very useful if you want to use features not yet supported in the tool (e.g. forward declarations, or layouts).
  • Other fixes in the parser (macros can be used in a function block, numbers can have suffixes, etc.)
  • Unix support. Use this zip file instead of the standard release, install a recent version of Mono (at least 2.10) and run mono shader_minifier.exe. This should work.
  • Online Minifier. You can use shader minifier online, but only one shader at a time.

Who is using it?

During the last year, many great intros have used it with great success.

  • Another Theory, by FRequency (#1 at Main 2010)
  • white one by Never (#1 at the Ultimate Meeting 2010)
  • anglerfish by Cubicle (#1 at Assembly 2011)
  • RED by BluFlame (#1 at Evoke 2011)
  • akiko by flopi (#2 at Riverwash 2011)
  • fr-071: sunr4y by farbrausch (#1 at Sunrise 2011)

The Easter trip

Every year, we enjoy going to the German demoparty at Easter. It’s always a great party, where we meet our friends again. However, we’ve been invited this year to the Scene.org Awards ceremony in Norway, and it was hard to make a choice between them. We eventually decided to follow Truck’s suggestion and go to both (thanks again Truck, you’ve been extremely helpful). We chose to take ten days of holiday, and visit Norway at the same time (Bergen, Oslo, fjords, etc.). Everything was awesome (and very expensive!).

The Gathering is a good party: it’s huge and well organized. There are many compos, seminars, and a separate area for sceners. Alcohol is not allowed, but Easter Garden, a small and cosy place for sceners, is right next to it. Having both parties is a great idea, as they complement each other. The main problem with The Gathering is that the screen and the sound system are not big enough to cover the whole venue, so many people (gamers) end up ignoring the compo shows.

Scene.org awards

Fairlight statues in front of our nominee certificate.

I’ve been very happy with the Scene.org Award organisation. The team has done a wonderful job, both for the show and the after party, thank you very much for this!

Revision was a great party, with almost the same mood as Breakpoint. Beer was cheap (especially when you come from Norway), organizers were friendly. Revision has been the occasion for Zavie and Cyborg Jeff to release D – Four (ranked 3/11).

Everyone should go to a demoparty at Easter.

Achievement unlocked

We have been told that B – Incubation was nominated in the Breakthrough Performance category of the Scene.org Awards 2010. Needless to say, this was awesome news that made our day (and probably the upcoming ones too; being nominated certainly does not mean winning the award, but that’s still quite something).

Whoever played a role in this: thank you very much!

As a side note, I’m pretty happy to see that most of the productions I hoped would be nominated were nominated. :-)

Shader Minifier 1.0

Since the last release, the number of users of GLSL Minifier has increased, while the number of bug reports has decreased. I think that’s a good reason to move to version 1.0. This obfuscator has been used in a few great intros, such as Another Theory (winner at Main) and White one (winner at tUM). Hopefully we will see many intros using it at Revision.

The main feature of this release is support for HLSL obfuscation, so we’ll now call this tool “Shader Minifier”. I got pretty good results with it, and I’ve been able to reduce the size of some famous 4k intros. The renaming strategy has been improved a bit; detailed statistics will come later.

Try it now, download Shader Minifier!

Changes
  • HLSL support. Please use the --hlsl flag.
  • in/out now behave like uniform on global values (requested by several people); you can choose to preserve them.
  • Information on console output is removed unless you use -v (verbose).
  • Spaces in macros are now stripped (thanks to @lx!).
  • New flag --no-renaming to prevent any renaming.
  • Various fixes and improvements.