Creating a fusion of pixel-art and low-poly rendering
When I discovered the grainy 2D adventure »The Last Door« in 2014, I was thrilled by the dense atmosphere of this Lovecraftian horror story. It made me wonder to what extent the sensory impressions of a hand-drawn retro pixel adventure could be combined with current 3D rendering techniques. As a brief experiment I built a cubist character in Blender and then used Unity to turn it into a playable prototype. These promising first steps soon grew into a complete point & click adventure, the indie project »A Room Beyond«. Since then I have received many questions about the technical details of the semi-three-dimensional pixel look. In the following I therefore explain the conceptual, creative and technical steps behind the »2.5D« raster graphics, and why simple post-processing isn't sufficient to recreate the charm of nostalgic raster graphics.
Why use pixel rendering at all?
When early point & click adventures such as the great LucasArts games appeared, their characteristic grainy look was a consequence of the hardware limitations of the time. Most backgrounds, sprites and animations were hand-drawn frame by frame, which gave them a charming look. But this technique also had its limits: everything visible was restricted to what had explicitly been drawn. Movements, for example, could only be shown in the perspective and at the frame rate of the drawn frames. Lighting effects such as highlights or dramatic shadows were likewise only possible if they were drawn in advance.
Irrespective of the technical possibilities and limitations, raster graphics can be an attractive visual effect, especially since they stimulate the player's imagination. Because not every detail is visible and an abstracting mosaic appears on screen instead, there is plenty of room for one's own interpretation. Just as a book leaves many concrete details open, unlike its screen adaptation, your own imagination enhances the entire gaming experience.
Since today's computing power is sufficient to display effects like 3D movements and lighting models in real time, the idea of compensating the deficiencies of raster graphics with real-time effects springs to mind. Although this approach seems pretty obvious, only a few projects seem to have followed it so far.
It's not enough to simply reduce the pixels (downsampling) to get closer to the impression of manually drawn raster graphics, especially since this effect can easily be noticed as such. Instead, for A Room Beyond I have used a combination of several creative decisions and technical means in order to imitate the visual appearance of raster graphics. Image 1 sums up the most important components for the overall composition.
Each camera setting is based on a 3D scene created in Unity. Floors, walls and other elements relevant for spatial movements are based on low-resolution polygon meshes. Decorations, static background images and foreground templates are added as simply textured layers, which both increases the performance and allows the direct implementation of manually drawn areas.
A pixelation shader screens the camera image prior to the drawing of GUI elements. By making slight adjustments, the code of the open-source »Unity Face Sensor« can be used for such a purpose. The shader is of particular relevance for the display of visible pixels along the 3D object edges which again depend on the camera perspective (image 2).
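The core idea of such a pixelation pass can be illustrated outside the engine. The following is a minimal Python sketch, not the shader used in the game: it assumes the image is a plain list of rows, and the names `pixelate` and `block` are hypothetical. Every output pixel takes the color sampled at the center of the grid tile it falls into, which is what makes 3D object edges appear as visible steps.

```python
def pixelate(image, block):
    """Replace every block x block tile with the colour sampled at its centre.

    `image` is a list of rows; `block` is the tile size in screen pixels.
    Both names are illustrative assumptions, not part of any engine API.
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        # Centre row of the tile containing y, clamped to the image border.
        sy = min((y // block) * block + block // 2, height - 1)
        row = []
        for x in range(width):
            # Centre column of the tile containing x, clamped as well.
            sx = min((x // block) * block + block // 2, width - 1)
            row.append(image[sy][sx])
        out.append(row)
    return out


# A small gradient stands in for the high-resolution camera image.
gradient = [[x + 10 * y for x in range(6)] for y in range(4)]
pixelated = pixelate(gradient, 2)
```

All four pixels of a 2 x 2 tile end up with the same value, so the output keeps its full resolution but shows a coarse grid.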
Pixelating the stored high-resolution camera image afterwards took up quite a bit of computing power, so I initially assumed that rendering directly in low resolution and then enlarging the result to screen resolution would be more efficient. Indeed, a corresponding test implementation significantly improved the frame rate. However, several problems arose as well: the shadow calculation produced unacceptable errors and artifacts, and antialiasing could not be applied sensibly to the rendering result. In addition, the GUI elements overlapping the pixelated camera picture required a further rendering step with an additional camera, and this double rendering negated the savings gained over the pixelation shader. The architecture also changed how mouse input was evaluated, which called for additional workaround code to imitate the usual behavior. Since the approach also limited preview and interactivity within the Unity Editor, the final conclusion was that the more expensive pixelation shader had paid off compared to direct low-resolution rendering, simply for the sake of simplicity, flexibility and accuracy.
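To give an impression of the kind of workaround the mouse input would need with low-resolution rendering, here is a small Python sketch. It assumes that the enlarged image fills the whole screen and that both coordinate systems share the same origin; the function name and the resolutions are made up for illustration.

```python
def screen_to_render_target(mouse_x, mouse_y, screen_size, target_size):
    """Map a mouse position on screen to a pixel of the low-res render target.

    Assumes the render target is stretched over the full screen
    (hypothetical setup, not taken from the original project).
    """
    screen_w, screen_h = screen_size
    target_w, target_h = target_size
    # Scale proportionally, then clamp so the far screen edge stays in range.
    tx = min(int(mouse_x * target_w / screen_w), target_w - 1)
    ty = min(int(mouse_y * target_h / screen_h), target_h - 1)
    return tx, ty


centre = screen_to_render_target(960, 540, (1920, 1080), (320, 180))
corner = screen_to_render_target(1920, 1080, (1920, 1080), (320, 180))
```

Every click would have to pass through such a conversion before it could be tested against scene objects, which is part of the extra effort described above.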
To increase the visibility of the pixels within object surfaces, it was sufficient to use grainy textures. Very low-resolution textures stretched across the object surface usually delivered good results. Deactivating Unity's import optimizations, such as smoothing filters and memory optimizations (»2-power correction«, i.e. rescaling to power-of-two dimensions), also contributed to sharp pixel edges. On a creative level, the visual plasticity of 3D objects could be enhanced by placing textures along their spatial course (image 3).
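Why the smoothing filter matters can be shown with a one-dimensional example. The Python sketch below is only an illustration with hypothetical function names: it enlarges a two-texel texture once by nearest-neighbor lookup (no smoothing) and once by linear interpolation, roughly what a bilinear filter does along one axis.

```python
def upscale_nearest(texels, factor):
    """Enlarge a 1D texture without filtering: each texel becomes a hard block."""
    return [texels[i // factor] for i in range(len(texels) * factor)]


def upscale_linear(texels, factor):
    """Enlarge a 1D texture with linear interpolation between texel centres."""
    out = []
    last = len(texels) - 1
    for i in range(len(texels) * factor):
        # Position of the output sample in texel space, clamped to the edges.
        pos = min(max((i + 0.5) / factor - 0.5, 0.0), last)
        left = int(pos)
        right = min(left + 1, last)
        t = pos - left
        out.append(texels[left] * (1 - t) + texels[right] * t)
    return out


dark_and_bright = [0, 100]
hard_edge = upscale_nearest(dark_and_bright, 4)
soft_edge = upscale_linear(dark_and_bright, 4)
```

The unfiltered version contains only the two original values, while the filtered one introduces intermediate shades that blur the pixel edge.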
The visual attractiveness was also influenced by the context of the scene: Objects were not only to be textured from an isolated point of view, but from the perspective of the surrounding elements in the final scene. A balanced mixture of noise textures (like the wooden floor in image 1) and single-colored textures (e.g. the cabinet in image 1) often proved to deliver an overall appealing, cartoon-like picture.
The camera concept in A Room Beyond is based on static perspectives with hard cuts. Depending on where the player character is standing, the game chooses the camera that presents the respective scene in the best way possible. On the one hand, the static cameras are a consequence of the pixelation shader: because the render result is modified, a moving camera would cause intensive jitter of all pixels. On the other hand, static cameras also allow a closer orientation towards image design concepts known from film theory. Directors like Alfred Hitchcock demonstrated in their work how the principle of the »Mise en Cadre« works as a dramaturgical tool: the placing of objects within the visible image frame does not just follow the currently active actors, but is mainly based on dramaturgical questions. Thus, unusual and maybe even deliberately unnatural camera perspectives and lighting models support particularly dramatic moments and emotions of the narration. An iconic leitmotif of the »Alone in the Dark« series (1992 to 1994) is the view of the character entering the scene through a window. Although from a technical point of view there is no difference to the third-person perspective of the rest of the game, in this case the dramaturgical placement of the camera causes players to leave their character for a short moment in their mind and slip into the skin of an unknown third person. Although the viewer isn't shown directly, the camera looking down increases the eerie feeling of being expected in the seemingly abandoned mansion.
Since A Room Beyond is also based on static camera perspectives, it's all the more important to express the spatiality of the scene through textures, object arrangement and lighting. Some lighting information, especially light edges, cast shadows and illuminating surfaces, is already drawn directly into the textures (image 4).
The final atmosphere of a scene is mainly dependent on engine light sources additionally placed in the 3D scene. Usually baked in light maps, they help arrange light and shadow depending on the respective model geometry and the placing of all objects in the scene. Only a few light sources are used as such in real time where they illuminate movable objects like characters. Besides the dramaturgical staging, lights can also help the game engine emphasize the plasticity and brightness of 3D models (images 5 and 6). The rules of cinematographic lighting such as the three-point lighting model can be easily applied to the virtual stage. Light sources that are contrarily colored and arranged in opposing angles then render the spatial orientation of object surfaces visible through different coloring. Thus, three-dimensional objects appear more vivid in a two-dimensional picture.
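The effect of contrarily colored lights from opposing angles can be reproduced with simple diffuse (Lambert) shading. The following Python sketch is an assumption-laden illustration, not the engine's lighting model: it uses two directional lights with made-up colors and directions and ignores shadows and light maps.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def shade(normal, lights):
    """Sum the Lambert terms colour * max(0, N . L) of all lights."""
    n = normalize(normal)
    result = [0.0, 0.0, 0.0]
    for direction, colour in lights:
        # `direction` points from the surface towards the light.
        to_light = normalize(direction)
        intensity = max(0.0, sum(a * b for a, b in zip(n, to_light)))
        for i in range(3):
            result[i] += colour[i] * intensity
    return tuple(min(1.0, c) for c in result)


# Hypothetical setup: a warm light from the upper left, a cool one from the upper right.
LIGHTS = [
    ((-1.0, 1.0, 0.0), (1.0, 0.6, 0.2)),
    ((1.0, 1.0, 0.0), (0.2, 0.4, 1.0)),
]

left_face = shade((-1, 0, 0), LIGHTS)   # faces the warm light only
right_face = shade((1, 0, 0), LIGHTS)   # faces the cool light only
top_face = shade((0, 1, 0), LIGHTS)     # receives both lights
```

The left face comes out reddish, the right face bluish and the top face a mixture of both, so the orientation of each surface remains readable even in a flat, pixelated image.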
In lots of cases, planes (or quads) with partially transparent textures and highly simplified grid objects are sufficient to generate the scene environment for gameplay with static camera perspectives. Photoshop's layer styles are an efficient tool for adding pixel details. Image 7 shows architectural objects created mainly through single-colored, hand-drawn planes. The use of layer styles allows for additional details like highlights, patterns and shadows. The big advantage of this procedure is that the result can still be edited. If, for example, the archway is to be changed, this can be simply and quickly done with pencil and eraser tools. All shadows, light edges, patterns and gradients automatically adjust to the new form.
The impression of raster graphics is generated in two ways. First, low-resolution source textures are scaled up in the game engine. Second, layer styles like the pattern overlay can be used to add grainy structures such as image noise or line patterns to any form or area. The noise property, which can be adjusted for shadow and glow styles, further strengthens the pixel effect.
Image 8 shows a camera setting which is essentially based on flat texture layers. Splitting it up into several geometry objects is still necessary to enable the virtual characters to move in space and, e.g., to run around pillars (image 8, left). However, all details and decorations are generated through textures which are almost equivalent to the final image (image 8, center). Last but not least, the pixelation shader applied at the end (image 8, right) only aligns crooked pixels distorted by the perspective in a straight grid.
Limitations and challenges
The main risk arising from the combination of pixel textures and post-processing pixelation is interference that produces seemingly broken pixels. This happens especially when details depend on being displayed by individual pixels, since the shader doesn't differentiate between creatively important and unimportant pixels. The details of the glass window shown in image 9 were all drawn into the texture. Because the perspectively scaled texture pixels differ in size from the resolution of the post-processing effect, individual texture pixels are split unattractively into several smaller image points. Luckily, there's a simple solution to this problem: since the phenomenon usually appears in zoomed-in close-ups, the post-processing filter can simply be deactivated for the respective cameras (image 9, right). Hard camera cuts and a similar pixel size make it hard to spot the difference between subsequent rasterization and a scaled pixel texture. Unfortunately, this method doesn't always work. In some perspectives, the shader interpolates to blurred color stains (image 10). This happens, for example, with the eyes of the virtual characters, which are ideally generated with just a few individual pixels. Since fixed pixel patterns for selected model parts can hardly be created while maintaining the freedom of spatial transformation, this is the price that has to be accepted for this technical approach.
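The interference can be made tangible by counting, along one scanline, how many cells of the pixelation grid sample each texture pixel. The Python sketch below is a simplified one-dimensional model with hypothetical names; it assumes the shader samples the center of each grid cell and ignores perspective distortion within the scanline.

```python
def blocks_per_texel(texel_width, block, screen_width):
    """Count how many pixelation cells sample each texture pixel on a scanline.

    texel_width: on-screen width of one texture pixel (in screen pixels)
    block:       cell size of the post-processing pixelation grid
    """
    counts = {}
    for cell_start in range(0, screen_width, block):
        # The pixelation pass is assumed to sample the centre of each cell.
        centre = min(cell_start + block // 2, screen_width - 1)
        texel = int(centre // texel_width)
        counts[texel] = counts.get(texel, 0) + 1
    return [counts.get(t, 0) for t in range(max(counts) + 1)]


matching = blocks_per_texel(texel_width=4, block=4, screen_width=20)
mismatched = blocks_per_texel(texel_width=5, block=2, screen_width=20)
```

With matching sizes every texture pixel maps to exactly one grid cell. In the mismatched case the texture pixels are split into two or three cells in alternation, which is the uneven pattern that reads as broken pixels.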
Camera movements, too, are limited in this way. In traditional, hand-drawn raster graphics, they often occur in the form of moved layers, with pixels moving only horizontally and vertically in the camera layer, not in depth. If, however, a perspective camera is moving in the room, the spatial perspective of all objects changes, resulting in all pixels not only being moved, but being perspectively re-calculated. Thus, the color of potentially each pixel changes, entailing a jitter of the entire camera image. The illusion of static raster graphics gets completely lost in this, and the downsampling of a high-resolution image is clearly perceptible for the player. Hence, camera movements should be limited to shifting the section of a full once-rendered and pixelated picture in order to avoid pixel jitter.
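Shifting the visible section of an already pixelated picture could look roughly like the following Python sketch. It is only an illustration under simple assumptions: the picture is a list of rows, the pan is horizontal, and the scroll offset is snapped to whole grid cells so that no pixel ever changes its color while scrolling. The function names are hypothetical.

```python
def snap_to_grid(offset, block):
    """Round a scroll offset down to a whole multiple of the pixel grid size."""
    return int(offset // block) * block


def crop_view(image, offset_x, view_width, block):
    """Return the visible section of a pre-rendered, already pixelated image."""
    x0 = snap_to_grid(offset_x, block)
    # Keep the view inside the picture.
    x0 = max(0, min(x0, len(image[0]) - view_width))
    return [row[x0:x0 + view_width] for row in image]


# One row of 16 values stands in for a wide, once-rendered panorama.
panorama = [list(range(16))]
view = crop_view(panorama, offset_x=5.9, view_width=8, block=4)
```

Because only the crop window moves, the pixels keep their colors and merely change position, just like moved layers in hand-drawn raster graphics.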
Benefits and usefulness
The benefits gained from this combination of preset raster graphics and subsequent algorithmic pixelation are the flexibility and dynamics achieved in 3D space. Scenes can be set up spatially, which simplifies the creation of different perspectives of the same location. If the props change, there's no need for any additional manual corrective work across several perspectives. It may be advisable to generate special effects like explosion lights simply through the real-time lighting of the game engine. The edges along the object geometry will be included in the calculation of light and shadows, just like bump map channels in the textures. Dynamic objects like characters can rotate, gesticulate and move freely in the scene, taking light and camera angle into account, without any additional implementation effort.
Finally, the interaction also benefits from the spatial scene. If the player character is, for example, on a hill overlooking a valley, the player can click on the valley to start walking there. Since longer distances in games with static framing usually run through several scenes, the program has to make sure that the player character walks through all intermediate screens between the starting point and the destination. With classic single-frame creation, this is an onerous process. However, since semi-automatic pixelation is based on a 3D scene, path-finding algorithms can be used. When the player clicks on a point in the distance, the character automatically walks along the computed path to the final destination, with triggers spread across the terrain switching to the appropriate camera perspective.
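The interplay of path finding and camera triggers can be sketched in a few lines of Python. This is a strongly simplified stand-in and not the game's implementation: a small grid replaces the 3D terrain, a breadth-first search replaces the engine's path finding, and each letter marks the hypothetical trigger zone of one static camera.

```python
from collections import deque


def find_path(grid, start, goal):
    """Breadth-first search over walkable cells ('#' marks an obstacle)."""
    rows, cols = len(grid), len(grid[0])
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the predecessors to build the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in previous:
                previous[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal not reachable


def camera_cuts(path, grid):
    """List the cameras in the order in which the walk triggers them."""
    cuts = []
    for r, c in path:
        camera = grid[r][c]
        if not cuts or cuts[-1] != camera:
            cuts.append(camera)
    return cuts


# Letters are camera zones, '#' is a rock the character has to walk around.
TERRAIN = [
    "AAABBB",
    "AA#BBB",
    "AA#CCC",
]
path = find_path(TERRAIN, (0, 0), (2, 5))
cuts = camera_cuts(path, TERRAIN)
```

A single click from zone A to zone C thus yields both the walking route and the sequence of hard cuts (A, B, C) without any per-screen scripting.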
Of course one could now argue whether the partially automated pixelation presented here is still a form of what is generally understood as »pixel art«. For advocates of the purest form, the manual editing of every single pixel by the artist is an essential characteristic of this creation style. Even simple filters and tools are frowned upon by some.
Although I absolutely think that semi-automatic pixelation can come close to the charm of purely hand-drawn raster graphics, some clearly visible differences remain. For example, it's in the nature of rendering freely movable objects that blurry or unattractive image areas may appear, which is a clear disadvantage compared to hand-drawn pixel art. Algorithmic smoothing, be it in terms of anti-aliasing or movement calculations, always looks a bit more artificial than the result of delicate craftsmanship.
Hence, this approach is not to be understood as an alternative way of producing pixel art. It's rather a style of its own, which stands apart from classic pixel art precisely because of its differences. Still, a certain retro feeling arises, especially since games like Alone in the Dark or Silver already experimented with the combination of static image material and real-time 3D objects back in the pixel era of the 1990s.