Author Archives: Zavie

Tunnel scene

New release: I – Probe

After the big endeavour that H – Immersion was, we needed a little break and wanted to do something less ambitious. Have more fun with less work. See if we would be able to make something decent within just a handful of weekends.

Moreover, we were curious to see what we could do now with that shiny new engine. A significant part of it was developed while working on the intro, so we still needed to try using it, and nothing but it. Focusing on content without adding features sounded like a good test to spot the parts of the engine that might be problematic.

Besides, the engine already had a couple of new features that we hadn't tested in a production yet. We wanted to make sure that they would work in a real production and not just in a prototype test scene.

Finally, our musician also wanted to experiment outside of the constraints of a 64kB intro. As we mentioned before, extreme size intros make content creation harder, because you cannot use your usual toolchain and workflow. For example, as a musician, you cannot use your sound samples.

So those were our constraints this time: a demo with no size limitation, just two or three weekends of work, try to spot problems in the engine (and fix them), but no new features.

After five or six weekends (spread over nine months), we released I – Probe in early November, at Alchimie. The title refers to the main feature showcased in the demo: the use of light probes for illumination. We consider the objectives fulfilled.

Talk at SIGGRAPH Asia 2018

We are proud to announce that we will be at the computer graphics conference SIGGRAPH Asia 2018 this December, where we will present the techniques used to create our 64K intro, H – Immersion.

At the conference, the “Computer Animation Festival” celebrates storytelling and animation in general, and showcases some of the best works of the year. We are honoured to have been selected among the talks there, and still in disbelief to be sitting next to talks about Pixar’s Incredibles 2 or Solo: A Star Wars Story.

If you are attending SIGGRAPH Asia this December in Tokyo, come to our session on Thursday 6th of December, from 16:15 to 18:00, in room G502 (glass building, fifth floor). All the details are available on the SIGGRAPH Asia 2018 session description. There is an iCalendar file as well.

Demoscene session at SIGGRAPH 2018

Tomasz Bednarz has been trying to increase the presence of the demoscene at the major graphics community conference, SIGGRAPH, for a few years now, through so-called “Birds of a Feather” sessions. This year I had the unexpected opportunity to attend SIGGRAPH in Vancouver, and I was invited to participate in the session along with a few other sceners. The details are available in the description that Tomasz posted.

There, I presented some aspects of 64k creation that Laurent and I have been discussing here in recent articles. The slides are available here:

Making an animation 18 bytes at a time.

A recording of the entire session is available. It includes the introduction by Tomasz, a presentation of a technique to render clouds in real time by Matt Swoboda (Smash, of Fairlight), our part, another take on 64k creation by Yohann Korndörfer (cupe of Mercury), and a presentation of Tokyo Demo Fest by Kentaro Oku (Kioku, of SystemK).

The event was way more successful than any of us expected, and we were all pleasantly surprised to see so much interest from the graphics community. A lot more people showed up than the room could accommodate, which unfortunately meant that many of them had to be turned away.

The waiting line for the Birds of a Feather session on demoscene, at SIGGRAPH 2018.

Hopefully this increased interest means we can expect more events like this to happen at SIGGRAPH in future years. We are already planning another demoscene session at SIGGRAPH Asia 2018, which will take place in Tokyo from December 4th to 7th.

A damaged mosaic texture

Texturing in a 64kB intro

This article is the second part of our series on the making of H – Immersion. You can read the first part here: A dive into the making of Immersion.

When making an animation within only 64kB, using images is tricky. We can’t store them in a traditional way, because it is not efficient enough, even with a compression format like JPEG. An alternative solution is procedural generation: using code to describe how to create the images at runtime. Our implementation of such a solution is the texture generator, a core part of our toolchain. In this post we will present how we designed it and how we used it in H – Immersion.

Seafloor scene

The spotlights of a submersible reveal details of the seafloor.

Early version

Texture generation has been one of the earliest elements of our code base: our first intro, B – Incubation, already had procedural textures. The code consisted of a set of functions to fill, filter, transform and combine textures, and one big loop to go over all the textures. Those functions were written in plain C++, but were later exposed with a C API so they could be evaluated by a C interpreter, PicoC. At the time, we were using PicoC in an effort to reduce iteration time: in this case it allowed us to modify and reload the textures at runtime. Limiting ourselves to the C subset was a small price to pay for the ability to change code and see the result without having to quit, compile and reload the entire demo again.

Steps for creating a wood texture

With a simple pattern, some noise and some deformation, we can obtain a stylized wood texture.

Desk scene

Various wood textures are used in this scene from F – Felix’s workshop.

We explored for a while what we could do with that generator, and ended up putting it on a web server, with a small PHP script behind a simple web interface. We would write texture code in a text field, the script would feed it to the generator, which would then dump the result as a PNG file for the page to display. Soon enough, we found ourselves doodling from the office during lunch breaks and sharing our little creations among group members. This interaction was a real boost for creativity.

An online gallery of procedural textures

Our old texture generator web gallery. All the textures were editable in the browser.

A complete redesign

For a long time the texture generator almost didn’t change; we thought it was fine and our efficiency plateaued. Then we woke up one day, and discovered that Internet forums were suddenly full of artists showing off their 100% procedurally generated textures and challenging each other with themes. Procedural content used to be a demoscene thing, but Allegorithmic, ShaderToy and the like had now made it accessible to a wide audience while we had not been paying attention, and they were beating us hard. Unacceptable!


It was high time to reevaluate our tools. Fortunately, working with the same texture generator for several years had given us time to understand its flaws. Our nascent mesh generator was also giving us some additional perspective on what we wanted a procedural content pipeline to look like.

The most important architectural mistake was implementing generation as a set of operations on texture objects. From a high level perspective, it may be a correct way of viewing it, but at the implementation level, having functions like texture.DoSomething() or Combine(textureA, textureB) has severe drawbacks.

First, the OOP style requires declaring those functions as part of the API, no matter how simple they are. This is a major problem because it doesn’t scale well and, more importantly, it creates friction in the creation process. We don’t want to change the API every time we try something new. It makes experimentation more difficult, and ultimately limits artistic creativity.

Second, in terms of performance, it forces us to loop over the texture data as many times as there are operations. It wouldn’t matter too much if those operations were expensive relative to the cost of accessing large chunks of memory; however, that’s usually not the case. Except for a few operations like generating a Perlin noise or doing a flood fill, most are in fact very simple and require few instructions per texture point. This means we keep traversing texture data to do trivial operations, which is ridiculously cache inefficient.

The new design addresses those issues with a simple reorganization of the logic. In practice, the majority of the functions just do the same operation for each element of the texture, independently. So instead of writing a function texture.DoSomething() which goes through all the elements, we can write texture.ApplyFunction(f) where f(element) only works on a single texture element. f(element) can then be written ad hoc for a specific texture.

This seems to be a minor modification. Yet doing so simplifies the API, makes the generation code more flexible and more expressive, is more cache friendly and trivially parallelizable. Many of you readers will probably recognize this as being essentially… a shader. Although the implementation is still, in fact, C++ code running on the CPU. We also keep the ability to do operations outside of the loop like before, but we only use that option when it is relevant, for example when doing a convolution.

Before:

// Logic is at the texture level.
// The API is bloated.
// The API is all there is.
// Generation of a texture has many passes.
class ProceduralTexture {
  void DoSomething(parameters) {
    for (int i = 0; i < size; ++i) {
      // Implementation details here.
      (*this)[i] = …
    }
  }
  void PerlinNoise(parameters) { … }
  void Voronoi(parameters) { … }
  void Filter(parameters) { … }
  void GenerateNormalMap() { … }
};

void GenerateSomeTexture(texture t) {
  t.PerlinNoise(someParameter);
  t.Filter(someOtherParameter);
  … // etc.
  t.GenerateNormalMap();
}

After:

// Logic is usually at the texture element level.
// The API is minimal.
// Operations are written as needed.
// Generation of a texture has a reduced number of passes.
class ProceduralTexture {
  void ApplyFunction(functionPointer f) {
    for (int i = 0; i < size; ++i) {
      // Implementation passed as a parameter.
      (*this)[i] = f((*this)[i]);
    }
  }
};

void GenerateNormalMap(ProceduralTexture t) { … }

void SomeTextureGenerationPass(void* out, PixelInfo in) {
  result = PerlinNoise(in);
  result = Filter(result);
  … // etc.
  *out = result;
}

void GenerateSomeTexture(texture t) {
  t.ApplyFunction(SomeTextureGenerationPass);
  GenerateNormalMap(t);
}

Parallelization

Generating textures takes time, and an obvious candidate for reducing that time is parallel code execution. At the very least, it is possible to generate several textures concurrently. This is what we did up to F – Felix’s workshop, and it greatly reduced loading time.

However, doing so doesn’t shorten generation time where we most want it: generating a single texture still takes as much time, which affects editing, when we keep reloading the same texture again and again between modifications. It is preferable to parallelize the inner texture generation code instead. Since the code now essentially consists of just one big function applied in a loop to each texel, parallelization becomes very simple and efficient. The cost of experimenting, tweaking and doodling is reduced, and that directly impacts creativity.
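
To give an idea of what this looks like, here is a minimal sketch of such a parallel loop; the raw pointer interface, the chunking strategy and the use of std::thread are simplifications for the example, not our engine’s actual API.

#include <algorithm>
#include <thread>
#include <vector>

// Sketch: apply f to every element of a texture buffer, splitting the range
// across hardware threads. Elements are independent, so threads never touch
// the same location and no synchronization is needed besides the final join.
template <typename Element, typename Func>
void ApplyFunctionParallel(Element* data, int size, Func f)
{
  const int numThreads = std::max(1u, std::thread::hardware_concurrency());
  const int chunkSize = (size + numThreads - 1) / numThreads;

  std::vector<std::thread> workers;
  for (int t = 0; t < numThreads; ++t)
  {
    const int begin = t * chunkSize;
    const int end = std::min(size, begin + chunkSize);
    if (begin >= end) break;
    workers.emplace_back([=]() {
      for (int i = begin; i < end; ++i)
        data[i] = f(data[i]);
    });
  }
  for (std::thread& worker : workers)
    worker.join();
}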

A damaged mosaic texture
A mosaic texture

This illustration is an idea that we explored and abandoned for H – Immersion: a mosaic decoration with orichalcum lining. It is shown here in our live editing tool.

GPU side generation

In case it isn’t completely clear in the paragraphs above, texture generation is done entirely on the CPU. At this point some of you might be staring at these lines with incredulity and thinking: “But, why?!”. Generating textures on the GPU would seem like the obvious thing to do. For starters it would likely speed up generation by an order of magnitude. So, why?

The main reason is that staying on the CPU was a smaller redesign step. Moving to the GPU would have been more work: it would have required solving additional problems, new problems we don’t have enough experience with yet. On the CPU we had a good understanding of what we wanted and how to fix some of the earlier mistakes.

The good news however, is that with the new design it now seems fairly trivial to experiment with GPU side generation as well. In the future, testing combinations of both could be an interesting path to explore.

Texture generation and physically based shading

Another limitation of the old design was that a texture was considered to be just an RGB image. If we wanted to generate more information, say, a diffuse texture and a normal texture for the same surface, nothing was preventing us from doing that, but the API wasn’t actively helping either. This takes on special importance in the context of Physically Based Shading (PBR).

In a traditional non-PBR pipeline, surfaces typically use color textures in which a lot of information is baked. Those textures often represent the final appearance of the surface: they already have some volume, the crevices are darkened, and they may even have some reflection highlights. If more than one texture is used at a time, it’s usually to combine details of large and small scale, to add normal mapping, or to represent how reflective the surface is.

In a PBR pipeline on the contrary, surfaces tend to use sets of different textures that represent physical values rather than a desired artistic result. The diffuse color texture, which is the closest to what we commonly describe as “the color” of a surface, typically looks flat and uninteresting. The specular color is dictated by the surface index of refraction. Most of the detail and variety come from the normal and the roughness textures (which you could argue represent the same thing, but at two different scales). How reflective the surface feels just becomes a consequence of the roughness. At this point, it makes sense not to think in terms of textures anymore, but in terms of materials.

Greetings marble floor texture breakoff
Cobbles textures breakoff
Fountain scene
Seafloor textures breakoff
Seafloor scene
Old stone textures breakoff
Arch scene
Submersible body texture breakoff
Launch scene

The current design allows us to declare arbitrary pixel formats for textures. By making it part of the API, we can have all the boilerplate taken care of. Once the pixel format is declared, we can focus on writing the creative code, without spending additional effort on processing that data. Upon execution, it will generate several textures and upload them to the GPU, transparently.

Some PBR workflows don’t directly expose diffuse and specular colors, but instead a “base color” and a “metalness” parameter, which have some advantages and some disadvantages. In H – Immersion we use a diffuse+specular model, and a material usually consists of 5 layers:

  1. Diffuse color (RGB; 0: Vantablack; 1: fresh snow).
  2. Specular color (RGB; proportion of light reflected at normal incidence, i.e. at 90° from the surface, a.k.a. F0 or R0).
  3. Roughness (A; 0: perfectly smooth; 1: rubber like).
  4. Normal (XYZ; unit vector).
  5. Relief elevation (A; used for parallax occlusion mapping).

When it was used, emissive detail was added directly in the shader. It didn’t seem necessary to have ambient occlusion either, since most scenes didn’t have ambient light at all. It wouldn’t be surprising to have such additional layers though, or other kinds of information like anisotropy or opacity for example.
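
As an illustration, the texel layout corresponding to the five layers above could be declared as something like the following; the field names and the plain float storage are assumptions for the example, not the engine’s actual pixel format declaration, and the extra layers mentioned above would simply be additional fields.

// Sketch of a texel layout for the 5-layer material described above.
struct MaterialTexel
{
  float diffuse[3];    // RGB diffuse color
  float specular[3];   // RGB specular color at normal incidence (F0)
  float roughness;     // 0: perfectly smooth; 1: rubber like
  float normal[3];     // XYZ unit vector
  float elevation;     // relief elevation, used for parallax occlusion mapping
};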

Wall texture without ambient occlusion
Wall texture with ambient occlusion

Pictured here is a recent experiment at generating local ambient occlusion based on the height. For each direction, march a given distance and keep the biggest tangent (height difference divided by distance). Finally, compute occlusion from the average tangent.
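
A CPU-side sketch of that experiment could look like the following; the Heightmap type, its accessors and the final remapping from tangent to occlusion are illustrative assumptions rather than the actual code.

#include <algorithm>
#include <cmath>

// Sketch: local ambient occlusion from a heightmap. For each texel, march in
// several directions, keep the largest tangent (height difference divided by
// distance) found in each direction, and derive occlusion from the average.
float LocalAmbientOcclusion(const Heightmap& h, int x, int y,
                            int numDirections = 16, int numSteps = 8)
{
  const float pi = 3.14159265f;
  float tangentSum = 0.f;
  for (int d = 0; d < numDirections; ++d)
  {
    const float angle = 2.f * pi * d / numDirections;
    const float dx = cosf(angle);
    const float dy = sinf(angle);
    float maxTangent = 0.f;
    for (int s = 1; s <= numSteps; ++s)
    {
      const int sx = x + int(dx * s);
      const int sy = y + int(dy * s);
      if (sx < 0 || sy < 0 || sx >= h.width() || sy >= h.height())
        break;
      const float heightDiff = h.at(sx, sy) - h.at(x, y);
      maxTangent = std::max(maxTangent, heightDiff / float(s));
    }
    tangentSum += maxTangent;
  }
  const float averageTangent = tangentSum / numDirections;
  // Remap the average tangent to an occlusion factor; the exact curve is an
  // artistic choice.
  return 1.f / (1.f + averageTangent);
}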

Limitations and future work

As you can see, the current design is a strong improvement over the previous one, and it provides creative expressivity. However, it still has limitations that we would like to address in the future.

For example, although it wasn’t a problem for this intro, we noticed that memory allocation could be an obstacle. The generation of a texture uses a single array of floats. For large textures with many layers, this can quickly hit the point where allocation fails. There are various ways to address this, but they all come with drawbacks. For example we could generate the textures tile by tile, which would scale better, but some operations like convolution would become less straightforward to implement.

Finally, despite using the word “material” in this article, we have only talked about textures and never about shaders. Yet a material should arguably encompass the shading part as well. This contradiction reflects the limitation of our current design: texture generation and shading are two distinct parts, separated by a bridge. We have tried to make that bridge as simple to cross as possible, but what we really want is to treat the two as a whole. For example, if a material has static features as well as dynamic ones, we want to describe them in the same place. This is a difficult topic and we don’t know yet what a good solution could be, but let’s go one step at a time.

A doodle after Imadol Delgado's texture

An experiment in trying to create a fabric texture similar to the earlier texture by Imadol Delgado.

Temple of Diana - Transverse section

A dive into the making of Immersion

This article is also available in Russian, courtesy of Vlad Brown:
Погружение в создание Immersion

At last. Last December, we finally finished it. This video here is our latest production, a 4 minute animation called “Immersion”. To be more precise, it’s a capture of what is usually referred to as a 64k intro. But more on that later.

Making it took the better part of two years’ free time. It all started during Revision 2015, a large event that takes place every year in Germany, during the Easter weekend. The two of us were chatting on our few kilometers long walk from the hotel to the party place, our faces battling the brisk morning air and the sleep deprivation. The previous night, the level of the 64kB competition had been high. Really high. The long established Hungarian group Conspiracy was finally back with a serious bombastic entry. Our best enemy Approximate was perfectly on time for its three-year release cycle and showing a great deal of improvement in storytelling. The prolific Mercury now had a mature design style, with a foreshadowing intro title that left no doubt about the showdown.

That year, coming empty handed, we were not part of the competition, but we sure wanted to get back as soon as possible. Yet, after such a show we were wondering: slick look, great storytelling, great design… how could we get to that level? I couldn’t see a concept that, even perfectly executed, would have been a clear winner over any of those three. Not to mention that our tech was below any of them. And so there we were, throwing ideas around on Hohenzollernstraße, when finally one of them stuck. A city rising out of the sea. That was a concept that, well executed, could maybe stand a chance at competing at the level this subset of that subculture had reached. Revision 2016, get ready, here we come!

Revision 2016 zoomed past us with a whooshing sound… Revision 2017 it would be then. Alas, we barely made it to this new deadline either. At the party when people asked how it was going, the answer was a witty “It took us a year to make the first half, I’m confident we can make the second half in 24 hours”. We couldn’t. We did release though, but that second half was rushed, and it showed. So much so that we didn’t get even close to the podium. But we worked on it, gave it the love we thought it needed, and at last released the final version shown above.

What’s a 64k intro?

Demos are digital art creations at the crossroads of short films, music videos and video games. Although they present a non-interactive experience, often music driven, like a music video does, they are rendered in real-time like video games are.

64kB intros, 64k for short, are like demos but with an added arbitrary limitation on the size: they must fit entirely within a single binary file of no more than 65536 bytes. No extra assets, no network, no extra libraries: the usual rule is that it should run on a freshly installed Windows PC with up to date drivers.

But how big is that exactly? Here are some comparison points.
In a 64kB file, you could store either:

  • 400ms of wave sound with CD quality, or
  • 3s of mp3 at 192kbps, or
  • A 200×100 RGB .bmp image, or
  • A JPEG picture of medium size, medium quality, like this 800×450 screenshot from the intro:

64kB screenshot

A 65595 bytes JPEG image, 59 bytes over the 64kB limit. :)

Yes, you’ve read that right: the video embedded at the beginning of this post fits entirely within a single file that takes no more space than a screenshot from the video itself.

When you see these numbers, it seems complicated to fit in the binary all the images and sounds that surely must be necessary. We talked previously about some of the compromises we make and some of the tricks we use in order to make everything fit within such a small size. But that is not enough.

In fact, because of these extreme constraints, usual techniques and tools cannot be used. We wrote our own toolchain instead, a task that is an interesting challenge in itself: we create textures, 3D models, animations, camera paths, music, etc. thanks to algorithms, procedural generation and compression. We’ll talk about those very soon.

Some numbers

Here is an overview of how those 64kB are spent:

  • Music: 12.4kB
  • Meshes: 12.5kB
  • Textures: 4.8kB
  • Camera data: 1.3kB
  • Shaders: 6.2kB, from 5k lines of code
  • Engine: 12.9kB, from 20k lines of code
  • Intro itself: 12k lines of code
  • Time spent: hours, maybe over a thousand of them

Distribution of the binary space usage

This chart shows how the 64kB are used by the different types of content, after compression.

Evolution of the binary size

This chart shows how the binary size (excluding ~2kB of depacker) evolved until the final release.

Design & Inspiration

Having agreed that the central theme was a submerged city, one of the early questions was: how was this city going to look? Where was it located, why was it submerged, what was its architecture? One simple answer addressed all these points: it could be the legendary lost city of Atlantis itself. This would also explain and justify the emergence: by its divine nature (a literal deus ex machina). And thus it was so decided.

Concept art of Atlantis

An early concept art for the emerged city. The artworks shown in this article were created by Benoît Molenda.

Two books guided our design decisions: Timaeus and Critias, in which Plato describes Atlantis and its fate. In Critias in particular, he details the structure of the city, its colors, its abundance of the precious orichalcum (which became an essential element in the temple scene), its circular shape, and the main temple dedicated to Poseidon and Cleito. Since Plato apparently based his description on countries he knew, a mix of Greek, Egyptian and Babylonian styles, we decided to stick with these.

Concept art of city details
Concept art of other city details

Without proper knowledge of the topic though, creating convincing antique architecture seemed challenging. Instead, we decided to reproduce existing buildings:

Searching for reference material on the Artemision turned out to be an unexpected, enriching experience. Originally, we were only looking for photographs, schematics or maps for reference. But when we learned about the name “John Turtle Wood”, things took on a greater depth. Wood was the very person in charge of the search for, and ultimate discovery of, the temple location. Hoping that his name would yield better results than merely “Artemision”, we followed up, and we immediately found the book he wrote in 1877, in which he reports not only descriptions and drawings of the temple, but also his eleven-year journey to find the lost site, his negotiations with the British Museum to stay funded, his relations with the local workers and the diplomacy involved before randomly digging holes.

Those books were essential to the design decisions, but above all, reading them brought us, as individuals, a lot of value beyond the project itself.

Temple of Diana - Plan
Work in progress Artemision mesh

Temple of Diana - Transverse section
Temple of Diana - Longitudinal section

And by the way, what is the roof supposed to look like? Some representations, including Wood’s, have a hole in it and some do not; there is apparently some controversy. We decided to go with an open roof model, allowing us to reveal the interior of the temple with a beam of light. The illustrations above show the floor plan and the cross sections, from the book Discoveries at Ephesus, compared to our work in progress model of the temple.

Achieving the desired look

We knew from the beginning that the appearance of water would be crucial to this intro. So we spent a lot of time on it, starting with watching reference material to understand the essential elements of underwater look. You might have guessed inspiration from James Cameron’s The Abyss and Titanic, 3DMark 11, or Ridley Scott’s Blade Runner for lighting.

Getting the right look wasn’t about implementing and turning on some epic MakeBeautifulWater() function. Instead, it was the combination of a series of effects that, when refined, would eventually trick us, the viewers, into accepting the illusion and feeling “This is it, we’re underwater!”. But one mistake, and the illusion would collapse; a lesson we learned too late, when comments after the initial release pointed out where it broke down.

Concept art of the launch scene
Concept art of the underwater scenery

As illustrated above we also explored different non-realistic and sometimes extreme palettes, but we didn’t know how to achieve that look so we kept a classic color scheme in the end.

The water surface

Obelisk emergence scene

The rendering of the water surface assumes a flat plane reflection. Reflection and refraction are first rendered to separate textures, using cameras on one side and the other of the water plane. In the main pass, the water surface is rendered as a mesh with a material that combines reflection and refraction based on the normal and the view vector. The trick is to offset the texture coordinates based on the water surface normal in screen space. This technique is classic and well documented.

It works well at a medium scale like during the boat scene, but at a larger scale like in the final emergence scene, the result looks artificial. To make it believable, an artistic trick we used was to apply a Gaussian blur to the intermediate textures. Blurring the refraction texture gives a murky look to the water, and a greater sense of depth. Blurring the reflection texture helps make the sea look more choppy. Moreover, applying more blur in the vertical direction imitates the vertical trails expected from a water surface.
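
As an illustration of the idea, the fragment part of such a water material could look roughly like this; the texture names, the offset scale and the Schlick-style Fresnel weighting are assumptions here, not the exact shader from the intro. The reflection texture is assumed to be rendered from the mirrored camera, so the same screen coordinates can be used for the lookup.

uniform sampler2D reflectionTex;
uniform sampler2D refractionTex;

varying vec3 normal;       // water surface normal
varying vec3 viewVector;   // from the surface point towards the camera
varying vec4 screenCoord;  // clip space position, set to gl_Position in the vertex shader

void main()
{
  // Project onto the screen to sample the planar reflection and refraction renders.
  vec2 uv = 0.5 + 0.5 * screenCoord.xy / screenCoord.w;

  // Offset the lookups with the surface normal to fake the bending of rays by the waves.
  vec2 offset = 0.05 * normal.xz;

  vec3 refraction = texture2D(refractionTex, uv + offset).rgb;
  vec3 reflection = texture2D(reflectionTex, uv + offset).rgb;

  // Weight reflection against refraction depending on the viewing angle.
  float facing = max(dot(normalize(viewVector), normalize(normal)), 0.);
  float fresnel = 0.02 + 0.98 * pow(1. - facing, 5.);

  gl_FragColor = vec4(mix(refraction, reflection, fresnel), 1.);
}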

Emergence scene

A blurred image of the temple is reflected on the water surface.

The animation is done using simple Gerstner waves in the vertex shader, adding 8 of them with random directions and amplitude (within a given range). Smaller scale details are done in the fragment shader, including 16 more wave functions. A fake back-scattering effect based on normal and height brightens the tip of the waves, visible in the image above as small turquoise patches. During the launch scene, a few additional effects are added, like this rain drop shader.
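
For reference, a single Gerstner wave displacement could be sketched like this; the vertex shader would then sum 8 such calls with pseudo-random directions and amplitudes and add the result to the vertex position. The parameter names and the deep-water phase speed are assumptions for the example.

// One Gerstner wave: points move along circles, so crests get sharper than
// with a plain sum of sines.
vec3 gerstnerWave(vec2 position, vec2 direction, float amplitude,
                  float wavelength, float steepness, float time)
{
  float k = 2. * 3.14159265 / wavelength;  // wave number
  float c = sqrt(9.8 / k);                 // phase speed of a deep water wave
  float f = k * (dot(direction, position) - c * time);

  return vec3(steepness * amplitude * direction.x * cos(f),  // horizontal displacement
              amplitude * sin(f),                            // vertical displacement
              steepness * amplitude * direction.y * cos(f));
}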

Volumetric lighting

“How to make shafts of light for the submersible?” was one of the early technical questions. Maybe a translucent billboard with a beautiful shader could work? One day, we started experimenting with naive ray marching through a medium. We observed with delight that even in an early crude rendering test, and despite coder colors and the lack of a decent phase function, the volumetric lighting was immediately convincing. At that point, that initial billboard idea disappeared, never to be heard of ever again.

With this simple technique, effects we didn’t even dare think of were already baked in. As we added the phase function and played with it, it started to feel like the real deal. This opened a lot of possibilities from a cinematography point of view. But then there was performance.

Temple scene

Light shafts give this scene a look inspired by the film Blade Runner.

It was time to turn that prototype into a real effect, so we did our homework: we read Sébastien Hillaire’s tutorial, his DICE presentation, and looked at other approaches like the epipolar coordinates ones. In the end we settled on a simpler technique close to the one used in Killzone Shadow Fall (video here), with a few variations. The effect is done in one full-screen shader at half resolution (a condensed shader sketch follows the steps below):

  1. On each pixel, a ray is cast, and its intersections with each light cone are solved analytically.
    The math is described here (now guess on what occasion the article was written in the first place ;-) ). In terms of performance, it would probably be more efficient to use a light volume bounding mesh, but for a 64k it sounded simpler to use an analytic approach. Obviously, rays only go as far as the depth in the depth buffer.
  2. In case the ray intersects, the volume inside the cone is then ray marched.
    The number of steps is limited for performance reasons, and they are randomly offset to remove banding. This is a typical case of trading banding for noise, which is visually less objectionable.
  3. At each step, the shadow map corresponding to the light is fetched, and light contribution is accumulated according to a simple Henyey – Greenstein phase function.
    Unlike epipolar coordinates based approaches, using this technique it is possible to have heterogeneous medium density, which adds more variety, but we didn’t implement such an effect.
  4. The resulting image is upsampled using a two-pass bilateral Gaussian filter and added on top of the main render buffer. Unlike Sébastien’s tutorial, we don’t use temporal reprojection; we just use a high enough number of steps to reduce visible artifacts (8 steps in low quality settings, 32 steps in high quality settings).
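
Put together, those steps could look roughly like the following sketch, for a single light cone; intersectCone(), shadowVisibility() and the uniform names are assumptions for illustration, and distance attenuation and absorption are left out to keep it short.

uniform vec3 lightPosition;
uniform vec3 lightColor;
uniform float anisotropy;      // g parameter of the phase function
uniform float mediumDensity;

vec2 intersectCone(vec3 origin, vec3 dir);  // assumed helper: entry/exit distances along the ray
float shadowVisibility(vec3 p);             // assumed helper: shadow map lookup, 0 if occluded, 1 if lit

float rand(vec2 co)
{
  return fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453);
}

float henyeyGreenstein(float cosTheta, float g)
{
  float g2 = g * g;
  return (1. - g2) / (4. * 3.14159265 * pow(1. + g2 - 2. * g * cosTheta, 1.5));
}

vec3 scatteredLight(vec3 rayOrigin, vec3 rayDir, float sceneDistance)
{
  // 1. Analytic intersection of the view ray with the light cone, clamped by
  //    the distance already present in the depth buffer.
  vec2 range = intersectCone(rayOrigin, rayDir);
  float tNear = max(range.x, 0.);
  float tFar = min(range.y, sceneDistance);
  if (tFar <= tNear)
    return vec3(0.);

  // 2. Ray march inside the cone; the random offset trades banding for noise.
  const int STEPS = 32;  // 8 in low quality settings
  float stepSize = (tFar - tNear) / float(STEPS);
  float t = tNear + stepSize * rand(gl_FragCoord.xy);

  vec3 result = vec3(0.);
  for (int i = 0; i < STEPS; ++i)
  {
    vec3 p = rayOrigin + t * rayDir;

    // 3. Shadow map visibility and phase function (the sign convention of
    //    cosTheta depends on how the phase function is defined).
    float visibility = shadowVisibility(p);
    float cosTheta = dot(normalize(p - lightPosition), -rayDir);
    result += visibility * lightColor
            * henyeyGreenstein(cosTheta, anisotropy)
            * mediumDensity * stepSize;

    t += stepSize;
  }
  return result;
}

The result of this half-resolution pass is what then gets blurred bilaterally and composited over the main render, as described in the last step.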

Altar scene

Volumetric lighting makes it possible to give a mood and a distinctive cinematic look that would be difficult otherwise.

Light absorption

An immediately recognizable aspect of an underwater image is absorption. As objects get distant, they become less and less visible, their colors fading into the background, until they disappear completely. Similarly, the volume affected by light sources is reduced as light is quickly absorbed by the water medium.

This effect has great potential for cinematography, and modelling it is simple. It is done with two steps in the shader. A first step applies a simple absorption function to the light intensity when accumulating the lights affecting an object, therefore modifying the light color and intensity when it reaches surfaces. A second step applies the same absorption function to the final color of the object itself, thus modifying the perceived color depending on the distance from the camera.

The code roughly follows this logic:

vec3 lightAbsorption = pow(mediumColor, vec3(mediumDensity * lightDistance));
vec3 lightIntensity = distanceAttenuation * lightColor * lightAbsorption;

vec3 surfaceAbsorption = pow(mediumColor, vec3(mediumDensity * surfaceDistance));
vec3 surfaceColor = LightEquation(E, N, material) * lightIntensity * surfaceAbsorption;

Light absorption test

Test of light absorption in the water medium. Notice how color is affected by the distance from the camera and the distance from the light sources.

Adding vegetation

Seaweeds were an element we weren’t certain we could use. When reviewing the typical features of an underwater scenery, they were sitting among the top elements in the wish list, but their implementation seemed risky. Organic elements like that can be difficult to get right, and getting them wrong could break immersion. They would need to have a believable shape, be well integrated in their environment, and they might even require some subsurface scattering shading model.

One day though, we felt inspired to experiment. Starting from a cube, scaling it, and placing a random number of copies along a spiral around an imaginary trunk: from far enough it could pass for a long plant with many small branches. After adding a lot of noise to deform the model, it was already starting to look half decent.

Vegetation early test

A test shot with a few sparse plants.

However, as we tried adding those plants to a scene, we realized the performance tanked rapidly with the number of objects. This limited far too much how many of them we could place for the image to look convincing. It turns out our new, unoptimized engine was already hitting a first bottleneck. So we implemented a crude ad hoc frustum culling at the last minute (in the final version a proper culling is used :) ), allowing the dense bushes visible in the demo.

With appropriate density and sizes (patches with a normal distribution), and the details taken care of by the dim lighting, it was starting to look interesting. Experimenting more, we tried to animate them: a noise function to modulate the intensity of an imaginary underwater stream, an inverse exponential function to make the plants bend, and a sine so their tips would swirl in the stream. Doodling some more, we stumbled upon the money shot: the submersible casting a light through the bushes, drawing shadow patterns on the seafloor as it passed off camera.

Underwater vegetation

The vegetation casting shadow patterns on the seafloor.

Giving volume with particles

Particles are the final subtle touch. Pay close attention to any real underwater footage and you will notice all sorts of suspended matter. Stop paying attention and it disappears. We tuned particles to be barely noticeable, preventing them from getting in the way. Yet they give a sense of volume filled with a tangible medium, and help sell the look.

The technical side is fairly straightforward: in Immersion, particles are just instanced quads with a translucent material. The rendering order problem due to translucency was simply avoided by setting the position along one axis according to the instance id. By doing so, they are always drawn in the correct order along that axis. The particles volume then just has to be oriented properly for each shot. In fact, in many shots this is not even done at all, since the size of the particles and the darkness of the scene made noticeable artifacts rare enough.
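
A minimal vertex shader sketch of that ordering trick could look like this (using a GLSL version that exposes gl_InstanceID); the hash, the hard-coded instance count and the uniform names are illustrative assumptions, not the intro’s code.

#version 330

uniform mat4 modelView;
uniform mat4 projection;
uniform float volumeDepth;   // extent of the particle volume along the sorted axis
uniform float particleSize;

in vec2 corner;              // one of the 4 quad corners, in [-1, 1]

float hash(float n) { return fract(sin(n * 12.9898) * 43758.5453); }

void main()
{
  float id = float(gl_InstanceID);

  // Spread particles in x and y with a cheap hash; the third coordinate grows
  // monotonically with the instance id, so drawing the instances in order is
  // also drawing them sorted along that axis.
  vec3 center = vec3(hash(id) * 20. - 10.,
                     hash(id + 0.1) * 20. - 10.,
                     volumeDepth * id / 1000.);  // assuming 1000 instances

  // Billboard: expand the quad in view space so it always faces the camera.
  vec4 viewPos = modelView * vec4(center, 1.);
  viewPos.xy += corner * particleSize;
  gl_Position = projection * viewPos;
}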

Viaduct discovery scene

In this shot, particles provide depth cues and a sense of density as the submersible descends.

Music

How do you fit high-quality music in around 16kB? This problem is not new, and most 64kB intros written after .the .product in 2000 use the same concepts. The original series of articles is old, but still relevant: The Workings of FR-08’s Sound System.

In short, the idea is that we need the music sheet and a list of instruments. Each instrument is a function generating a sound procedurally (see for example Subtractive synthesis and Physical modelling synthesis). The music sheet represents the list of notes and effects to apply. It is stored in a format similar to MIDI, with some changes to reduce the size. During the execution of the program, the music is generated.

The synth also has a plugin version (VSTi) that the musician can use from his favorite tool. Once the music is composed, the musician clicks on a button, which exports all the data to a file. We embed the data in the demo.

When the demo is run, it starts a thread to generate the music in a giant buffer. The synth is CPU intensive and is not guaranteed to be real-time. This is why we start the thread before the beginning of the demo, while the textures and other data are being generated.

Daniel Lindholm composed the music, using the synth 64klang created by Dominik Ries.

Workflow

Iteration time is one of the most critical aspects of the workflow when making a demo. In fact, this is true of many creative processes. Iteration time is king. The faster you can iterate, the more you can experiment, the more variations you can explore, the more you can refine your vision and increase the overall quality. So we want to eliminate as much as possible all the obstacles, all the pauses, all the little frictions in the creation process. Ideally, we want to be able to change anything, any time, and see the result immediately, as a continuous feedback while we are still making the change.

A possible solution, used by many demo groups, is to build an editor and create all the content inside the editor. We didn’t. Our initial approach was to write C++ code and do everything inside Visual C++. Over time, we developed a number of techniques to improve the workflow and reduce iteration time.

Hot reload all the data

If there was only one single piece of advice to take away from this article, it would be this: make all your data hot reloadable. All of it. Make it so you can detect when the data is changed, load the new data when that happens, and update the state of your program accordingly.

One by one, we have made all our data hot reloadable. The shaders, the camera, the editing, all the curves that depend on time, etc. In practice, we generally have an editor and the demo running on the side. Whenever we modify a file, the changes are immediately visible in the demo.

In a project as small as a demo, this is fairly simple to implement. Our engine keeps track of where the data comes from, and a small function regularly checks whether the timestamps of the corresponding files have changed. If they have, it triggers a reload of the corresponding data.
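
A bare-bones version of that polling could look like the following sketch, using the C++17 filesystem API; the FileWatcher name and the callback interface are illustrative assumptions, not our engine’s actual code.

#include <filesystem>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Sketch: poll file timestamps and trigger a reload callback when they change.
class FileWatcher
{
public:
  using Callback = std::function<void(const std::string&)>;

  void Watch(const std::string& path, Callback onChange)
  {
    entries[path] = { std::filesystem::last_write_time(path), std::move(onChange) };
  }

  // Called regularly, e.g. once per frame.
  void Update()
  {
    for (auto& [path, entry] : entries)
    {
      const auto stamp = std::filesystem::last_write_time(path);
      if (stamp != entry.lastWrite)
      {
        entry.lastWrite = stamp;
        entry.onChange(path);  // reload the corresponding data
      }
    }
  }

private:
  struct Entry
  {
    std::filesystem::file_time_type lastWrite;
    Callback onChange;
  };
  std::unordered_map<std::string, Entry> entries;
};

Typical use would be something like watcher.Watch("shaders/water.frag", [&](const std::string& path) { ReloadShader(path); }); with Update() called from the main loop (ReloadShader being whatever reload routine the engine provides).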

It might be significantly more involved in a bigger project where such changes are made difficult by complex dependencies and legacy design. But the impact it has on production cannot be overstated, so it is well worth the effort.

Tweakable values

Reloading data is all well and good, but what about the code itself? This is more complicated and we have approached this problem step by step.

The first step was a clever trick that allows changing the constant literals. Joel Davis described it in a post: a short macro that turns a constant into a variable, with a piece of code that detects when the source file is modified and updates the variable accordingly. Obviously, in the final binary this additional code is absent and only the constant is left. The compiler is therefore able to do all the optimizations (for example when the constant is set to 0).

This trick is limited but it is really simple and can be integrated in the code in a matter of minutes. Moreover, although it is only meant to tweak constants, it can still be used for debugging purposes to modify a code path or toggle features with conditions like if(_TV(1)).
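
To give an idea, a heavily simplified version of such a macro could look like this; error handling, caching of the file content and several details of the real trick are omitted, and the names (including the ENABLE_TWEAKABLES switch) are illustrative.

#include <fstream>
#include <sstream>
#include <string>

#ifdef ENABLE_TWEAKABLES
// Re-read the source line where the macro appears and parse the literal
// inside the first _TV( ... ) found on it. A real implementation caches the
// file content and only re-reads it when the timestamp changes.
inline double TweakValue(double defaultValue, const char* file, int line)
{
  std::ifstream f(file);
  std::string text;
  for (int i = 0; i < line && std::getline(f, text); ++i) {}

  const size_t start = text.find("_TV(");
  if (start == std::string::npos)
    return defaultValue;

  double value = defaultValue;
  std::istringstream(text.substr(start + 4)) >> value;
  return value;
}
#define _TV(value) TweakValue(value, __FILE__, __LINE__)
#else
#define _TV(value) (value)  // in the final binary, only the constant remains
#endif

With this in place, a constant written as _TV(2.5) can be edited in the source file and picked up by the running demo the next time the expression is evaluated.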

C++ recompilation

Finally, our most recent step in our quest to make the code more malleable has been the inclusion of the tool Runtime Compiled C++ in our codebase. By compiling the code as a dynamic library and loading it, as well as doing a bit of serialization juggling, it allows us to make changes to that code and see the result at runtime, without restarting the program or, in this case, the demo.

This is not perfect yet: the API is intrusive and constrains the design (classes have to derive from an interface), and compiling and reloading the code still take a few seconds. Yet the ability to make changes to the code logic inside the demo and see the result in situation enables a great deal of creativity. At the moment only our texture and mesh generators benefit from it, but in the future we want to extend it to the entirety of the “content” code.

To be continued

Here ends the first part of what will be a series of articles on the techniques used in H – Immersion. We’d like to thank Alan Wolfe for proofreading; you can check out his many technical articles on his blog. In the next parts, we will present in more detail how the textures and the meshes are created.

Until then, feel free to ask any question or share your own experience.

Part 2: Texturing in a 64kB intro.

Launch scene

New release: final version of H – Immersion

After months of polishing, we’ve finally released the final version of our latest 64kB intro: H – Immersion. You can read the details, download the binary or just watch the captured video from the production page.

We’re also currently doing a write up to show some of the techniques involved in this intro, which we’re hoping to publish here soon.

F – Felix’s workshop to be shown at SIGGRAPH 2013

This is the breaking news that crashed into our mailboxes yesterday. The PC 64kB Intro we released last year at Revision, F – Felix’s workshop, has been selected to be shown at SIGGRAPH 2013 as part of the Real-Time Live! demoscene reel event.

Unfortunately we won’t be able to attend SIGGRAPH this year, but having our work shown there is awesome news nonetheless.

Back from Revision

I don’t know if this is going to become some sort of tradition for us, but as a matter of fact, we have attended every Easter party since the creation of our group. This year was no exception, and we had a really great time at Revision.

Revision is the kind of party that is just big enough that even though at some point you think “OK, I’ve met pretty much everyone I wanted to”, when you get home you realize how many people you wanted to meet and did not. It’s also the kind of party that is so massively awesome that when you get back to your normal life, you experience some sort of post-party depression, on top of the exhaustion, and you have to be prepared for when it strikes.

Sidrip Alliance performing at Revision

So we’ve been there, and this year we presented the result of the last months of work in the PC 64k competition. The discussion of the concept started back in May 2011, and we seriously started working on it maybe around August.

While Revision was approaching, rumors were getting stronger about who would enter the competition, how serious they were about it, and how likely they were to finish in time. It became very clear that the competition was going to be very interesting, but even so, it completely exceeded expectations. It even got mentioned on Slashdot!

Our intro, F – Felix’s workshop, ended up in 2nd place, after Approximate‘s gorgeous hypno-strawberries, Gaia Machina. The feedback has been very warm, during the competition as well as afterwards. Also, as if that was not enough, to our surprise our previous intro, D – Four, has been nominated for two Scene.org Awards: Most Original Concept and Public Choice. Do I need to say we’re pretty happy with so much good news? :) Thank you all!

Now that a week has passed, we’re back to our daily lives, slowly recovering, and already thinking about what we’re going to do next. :) Until then, here is a capture of our intro:

Achievement unlocked

We have learned that B – Incubation was nominated for the Breakthrough Performance category of the Scene.org Awards 2010. Needless to say, this was awesome news that made our day (and probably the upcoming ones too; being nominated certainly does not mean actually winning the award, but that’s still quite something).

Whoever played a role in this: thank you very much!

As a side note, I’m pretty happy to see that most of the productions I hoped to see nominated were indeed nominated. :-)

Blurry things

Our latest production, E – Departure, is completely music driven and features some fast camera moves. To make those visually more interesting we use motion blur, which we happen to be fairly happy with. This article explains how this motion blur post-processing effect is achieved. The shader language used in the code samples is GLSL.

E - Departure

Motion blur in real life

A photo with motion blur

First, let’s take a moment to recall what motion blur physically is. When a photo or a video is shot with a camera, the exposure time (or shutter speed) is the parameter controlling how long the film, or sensor, is exposed to incoming light. The longer it is, the more light is caught by the sensor, thus the lighter the image gets, and the blurrier moving elements become. This can be seen as a drawback (in a dark environment, getting crisp images becomes more difficult) or as a desired effect (since motion blur conveys a sense of speed).

While the shutter is open, the effect of light on the sensor is more or less constant: all things being equal, a light trail should have a constant lightness. In other words, it’s almost a box filter. I feel this point needs to be stated since some people confuse this effect with the artistic device of giving more contrast at the base of the trail than at the tail. In photography such an effect can be achieved by firing a flash right before the diaphragm closes (a technique known as rear curtain sync), but this is not the kind of motion effect I am referring to in this article.

Velocity map based motion blur

I’d say there are mainly three ways of achieving motion blur: stochastic based, accumulative based and velocity map based.

The stochastic approach consists in randomly sampling at different points in time and doing a proper weighted average. Trivial in ray tracing, this is difficult to achieve with rasterization but there is research in that direction. Insufficient sampling will lead to noise.

The accumulative motion blur consists in imitating the physical effect by blending together snapshots taken over the duration of the simulated aperture. Doing so is expensive since for each frame the geometry has to be processed and shaded. Insufficient sampling will lead to banding.

The velocity map based motion blur consists in rendering one image of the scene, no different from the regular rendering one would have without blur, and to make every point of this image bleed in some direction according to its speed. To do so, the speed of each point is computed and stored in a buffer: the velocity map. This technique will have the same banding artifacts as the accumulative approach, and some more. This is the technique we chose.

Making three educated steps away from reality

Bleeding is not motion blur

At this point of the description, the velocity map technique is already expected to give a result that does not match reality. The reason is simply that having object colors bleed over each other is not equivalent to the accumulation of light during the aperture time. Let’s give a simple example to illustrate this.

Viewer's point of view

Imagine we have a black background, a static red sphere, and a moving white sphere passing in front of the red one. If the snapshot used for the final image is taken at the moment the white sphere is completely occluding the red one from the viewer point of view, the red sphere won’t contribute at all to the final image. With a camera taking a photo, there may be a part of the aperture duration when the red ball is still visible, hence contributing to the amount of light caught by the sensor. Accumulative motion blur will reproduce this effect while velocity map will not.

Expected resulting images

We know this is not accurate, but we are trading accuracy for speed, and as long as we don’t forget this and we accept the resulting quality this is ok. It is all a matter of balance between affordable and physically correct.

Linear motion versus arbitrary motion

Now another inaccuracy lies hidden in the velocity map technique description: the fact that we make colors bleed in the direction of the speed. The wording suggests that the blur for a given point will always be linear. And this is what will be assumed, since it is way easier to store the representation of a linear move than any other kind of move. But then it means that points moving along a circle, for instance with a rotation, will have a linear trail instead of a curved one. This approximation works as long as the amount of blur is very limited compared to the path. But with fast enough motion, like small circular moves, this will not work anymore and glitches will become noticeable. Should this be a problem, I would suggest storing an instantaneous center of rotation instead, for example.

As long as we are using linear operations, we can also avoid computing the speed for each point, and instead just compute it for each vertex and interpolate between them. The benefit of doing so is obvious: the speed will be a varying value computed in the vertex shader, and available thereafter in the fragment shader.

Scatter versus gather

At last, let’s get even farther from reality for a technical reason, still trading accuracy for speed. As explained, we will have two images: a snapshot of the scene, and an image representing the speed of each point. Ideally, we would like to have each moving point bleed over its neighbors according to its speed. Unfortunately, nowadays GPUs are designed in a way such that they can easily gather, but cannot scatter information: in a fragment shader, you can retrieve data from other fragments, but you cannot modify other fragments. So if you want to have one fragment affect some other one, you have to read it from that other fragment. And if you don’t know in advance which fragments are affected, which is the case here, for a given fragment you have to check all neighboring fragments and gather the ones affecting it. If a fragment might affect fragments up to n pixels away, this means having to check O(n²) fragments (for more on this topic, see chapter 32 of the book GPU Gems 2). This quickly becomes very expensive.

So what we will do here is consider motion blur to be homogeneous enough so we can trade the scattering we would like to do for a simple gather operation. Instead of making our fragment bleed over, we will dilute it among other fragments. This is a strong assumption, that will often prove false. But fortunately, the average result will still look nice, even though some obvious artifacts will appear here and there.

Computing the velocity map

Let’s now get to the implementation part. I am assuming here that we have some FBOs (Frame Buffer Objects) ready, in order to be able to render to texture efficiently, or the equivalent if the API is not OpenGL. I won’t go into further detail about this since it is quite a different topic from what is being discussed here. If you don’t know how to set up and use FBOs, at least you now have the keywords and can Google them.

We will need a couple of things to build our velocity map. First, we will need a separate buffer to store it. With OpenGL this is done with glDrawBuffers(), which allows us to tell which buffers we are going to write into. By default there is only one such buffer, but we can have several of them; here we just need one additional buffer. Second, we will also need a way to compute the speed of the vertices. The motion blur is a screen space effect, so all we need is the speed in screen space.

During a typical forward rendering pass, each object will have some kind of matrix, stack of matrices, or any equivalent means to compute its position in world coordinates. The camera is likely to have its own set of matrices, to transform points from world space to camera space, then from camera space to screen space (in OpenGL this last transformation is typically defined by the matrix referred to by GL_PROJECTION_MATRIX). The position of a vertex at a given time t is therefore multiplied by each of those matrices to get the final position in screen space. To know the speed of such a vertex in screen space, we just need to know where it was at time t – dt. If dt is exactly the duration of the aperture, the two positions will define the limits of the motion during that time. This is exactly what we want.

So basically we need a way to retrieve the transformation at t – dt while rendering at t. With an animation, this shouldn’t be too difficult: you just have to query your system for t – dt. If your animation is ad hoc, like with some real-time interaction, then you may store the previous transformations. Anyway, as long as you have both the previous object transformation and the previous camera transformation, you have it all.

In a minimalistic vertex shader, one would have something like the following:

void main()
{
  gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex; // equivalent to ftransform(), which is a deprecated function
}

For each object we will feed the shading pipeline with an additional piece of information: the transformation matrix at t – dt. I considered that the projection matrix was unlikely to change fast enough to have any consequence on motion blur, so I decided not to store it, and to use the current one for both the old and the current position. It is up to you to choose your policy, but you have to be consistent when computing the speed in the end. Anyway, this leads to the following new vertex shader:

uniform mat4 oldTransformation;
varying vec2 speed;

vec2 getSpeed()
{
  vec4 oldScreenCoord = gl_ProjectionMatrix * oldTransformation * gl_Vertex;
  vec4 newScreenCoord = gl_ProjectionMatrix * gl_ModelViewMatrix * gl_Vertex;
  vec2 v = newScreenCoord.xy / newScreenCoord.w - oldScreenCoord.xy / oldScreenCoord.w;
  return v;
}

void main()
{
  gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex; // No change here
  speed = getSpeed();
}

You may notice that the current position is computed twice (gl_Position and newScreenCoord). I left it this way to make the code easier to read, since the shader compiler will optimize this anyway. Also, you have to be careful with the homogeneous coordinates operation.

From now on let’s retrieve the interpolated speed in the fragment shader and finally store it in the velocity map. A typical fragment shader would look like this:

void main()
{
  gl_FragColor = /* whatever */
}

We are simply modifying it the following way:

varying vec2 speed;

vec3 getSpeedColor()
{
  return vec3(0.5 + 0.5 * speed, 0.);
}

void main()
{
  gl_FragData[0] = /* whatever the fragment color was */
  gl_FragData[1] = vec4(getSpeedColor(), 1.);
}

This is it. At this point we have a velocity map where color represents the motion of each point in screen space. So far so good.

Update: Actually, so far, not so good. The above code contains a bug that I didn’t notice at the time because we rarely met the conditions to make it visible. But since then it has been bugging me as we have been working on a scene that exhibits it pretty badly.

When polygons get clipped, it may lead to broken speed values, resulting in annoying strong blur artifacts. The reason behind this is the non-linear operation of dividing by the w component in the vertex shader, which yields wrong values after clipping.

To solve this, I suggest not dividing in the vertex shader, and instead passing the w component to the fragment shader in order to divide only at that point.
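
A sketch of that fix could look like the following, in the same spirit as the snippets above (not the exact corrected shader). In the vertex shader:

uniform mat4 oldTransformation;
varying vec4 oldScreenCoord;
varying vec4 newScreenCoord;

void main()
{
  gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
  oldScreenCoord = gl_ProjectionMatrix * oldTransformation * gl_Vertex;
  newScreenCoord = gl_ProjectionMatrix * gl_ModelViewMatrix * gl_Vertex;
  // No perspective divide here: the homogeneous coordinates are interpolated
  // as they are, which stays correct through clipping.
}

And in the fragment shader:

varying vec4 oldScreenCoord;
varying vec4 newScreenCoord;

vec2 getSpeed()
{
  // The divide by w happens per fragment, on correctly interpolated values.
  return newScreenCoord.xy / newScreenCoord.w - oldScreenCoord.xy / oldScreenCoord.w;
}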

Using the velocity map and applying the blur

Once the color buffer and velocity map are filled, the post processing pass will use both to generate the motion blurred image. This is done very simply: while in a very minimal blitting shader you would have something like the following,

uniform sampler2D colorBuffer;

void main()
{
  gl_FragColor = texture2D(colorBuffer, gl_TexCoord[0].xy);
}

here it becomes

uniform sampler2D colorBuffer;
uniform sampler2D velocityMap;

vec4 motionBlur(sampler2D color, sampler2D motion, vec2 uv, float intensity)
{
  vec2 speed = 2. * texture2D(motion, uv).rg - 1.;
  vec2 offset = intensity * speed;
  vec3 c = vec3(0.);

  float inc = 0.1;
  float weight = 0.;
  for (float i = 0.; i <= 1.; i += inc)
  {
    c += texture2D(color, uv + i * offset).rgb;
    weight += 1.;
  }
  c /= weight;
  return vec4(c, 1.);
}

void main()
{
  gl_FragColor = motionBlur(colorBuffer, velocityMap, gl_TexCoord[0].xy, 0.5);
}

In this example, the inc value defines a motion blur with 10 fetches. It proved to be sufficient in our case, and as few as 8 fetches could be enough for moderate blur. With 20 fetches it becomes really hard to notice artifacts, but it is also slower of course.

The function argument intensity controls the length of the motion trails. You can set it to be consistent with your rendering speed, or low for a fainter effect, or high for an exaggerated effect.

Dealing with precision issues

At this point, I had a nice looking motion blur already, but I noticed it tended to introduce instability at low speeds. This is due to the fact that a low speed means a short velocity vector, which, being stored with two integers, loses precision both in magnitude and direction. To overcome this issue, I decided to represent the velocity vector differently.

First, instead of storing its components, I stored the components of the normalized velocity vector on one side, and the norm on the other side. This dealt with the direction problem.

Second, I applied the very same trick gamma correction relies on: using a power function to increase the precision of low values, at the cost of precision for high values. The norm would then be stored in non-linear space, and transformed back when using it. Since varyings are interpolated linearly, we have no choice but to stay in linear space in the vertex shader, and change to non-linear space only in the fragment shader.

So in the vertex shader, once we have our per vertex speed vector we store it this way:

varying vec3 speed;

vec3 getSpeed()
{
  vec2 v = /* … */
  float norm = length(v);
  return vec3(normalize(v), norm);
}

While in the fragment shader the color we write becomes:

varying vec3 speed;

vec3 getSpeedColor()
{
  return vec3(0.5 + 0.5 * speed.xy, pow(speed.z, 0.5));
}

During the postprocessing pass, the speed is then read this way:

vec3 speedInfo = texture2D(motion, uv).rgb;
vec2 speed = (2. * speedInfo.xy - 1.) * pow(speedInfo.z, 2.);

Update: again, for the reason mentioned above, this code has to be changed and the norm has to be computed in the fragment shader. I am too lazy to fix the code here, so this is left as an exercise to the reader. ;-)

Bugs and limitations

Motion blur artifacts

As mentioned, each trade-off introduces inaccuracies that result in visual artifacts. The most important one is obviously the scatter versus gather one. You can see the visual cost in the screenshot above: notice the ghosting effect visible on the edges of some elements.

Another problem is how to manage the edges of the image. When an object is moving near the border, the algorithm may try to fetch colors outside the image. In that case, I see three solutions: leaving it as is if this is fine in your case (it might simply not happen often enough to be a problem), clamping the fetch to the border (this is what is done in E – Departure; notice how it affects the right side of the sample screenshot), or generating a bigger image to have a thin space outside the displayed frame where colors can still be fetched.

Those problems though are difficult to notice at low speed, and will appear only briefly at higher speed. So for E – Departure, this was an acceptable trade off.

Conclusion

I have presented here a technique that is fairly simple to implement, gives a visually pleasing effect, and noticeably increases the realism of the image for a limited cost. It played an important role in our demo; I hope you will find a good use for it too.