Voxel Raymarching
Categories: graphics
I recently stumbled upon some short clips of John Lin's path traced voxel renderer:
I found the visual style and effect really impressive and appealing (not to mention the beautiful design of the scene). Beyond the visuals, I found the demo technically impressive as well, since it demonstrated complex lighting effects and dynamic scenes using path-tracing. Watching this video made me want to learn about the techniques used to make the engine run at real-time, although I wasn't aiming for a demo as complete as John Lin's (since I started working on this project Lin has greatly advanced the capabilities of his engine, so check out the other videos he has posted). In this post, I hope to go over some components of the renderer I was inspired to engineer, although it is not yet complete.
Above is a screenshot of the images my engine can produce, although the level of quality and noise isn't currently acheived in real-time.
Voxels
Voxel data very quickly becomes extremely large, since the amount of data is proportional to the volume of the geometry, which is the cube of its linear size. Compare this to traditional triangle meshes, which use data proportional to the surface area of the geometry (square of linear size). Because of the potential size of the voxel data, many methods have been developed to compress it. The most common is to store the voxel data as a sparse voxel octree (SVO). The advantage of this is that large uniform areas (particularly empty space/air) are efficiently represented (no further subdivision in uniform volumes), and the result is that the amount of data is roughly proportional to the surface area between different materials (although it is still generally more than a mesh). I decided to use a further improvement on SVOs which deduplicates identical subtrees and stores the data as a sparse voxel directed acyclic graph (SVDAG). This compression scheme compresses large uniform volumes the same way as SVOs, but also handles repeated and similar geometry by having volumes with identical subvolumes point to the same data. However, the added compression can make modifications more complex or less efficient.
Path Tracing
Path tracing is a technique for computing an image a camera would create with physically correct lighting properties. The central idea is to sample many random paths that the light could take to get from a light source to your eye. Then, for each of these paths, calculate the color and intensity of the light that would enter the camera after the light reflects, refracts, or otherwise interacts with the scene along the path sample. The final computed light entering your eye is the average contribution of many paths computed in this fashion.
Path tracing can produce amazingly accurate images, but it is notoriously computationally expensive, and for complex scenes, producing an accurate render may take many times more samples than a simple scene. Besides low-level code and hardware optimization, many sampling optimizations have been introduced to decrease the number of samples required to calculate accurate images. One such technique is called (multiple-)importance sampling, which chooses paths that are more likely to be significant; for example, depending on the material, light will be reflected or refracted more from one direction than another, and paths that transmit more light through such an interaction will have higher representation in the samples. Another is Metropolis light transport, which applies applies the Metropolis-Hastings sampling technique to sample paths "near" or similar to previous paths that had large contributions to the calculated lighting.
Making it Real-Time
Path tracing has not been applied much in interactive settings because it is difficult to produce a high quality image with the number of samples that can be computed in a single frame. With complex scenes such as those seen in games, often only 1 path sample is practically possible between frames, which results in very noisy images. There are several techniques that can be used to reduce that noise in the images, which are known collectively as denoising algorithms. In general, these use path samples computed from previous frames as well as path samples from nearby pixels to artificially increase the number of samples for each pixel, resulting in a more accurate average.
Currently, I have implented a path tracing engine that produces noisy images. I have experimented with some basic denoising algorithms, but I plan on implementing a more complete and principled denoising algorithm next. The code for my engine is available on GitHub, but it is not fully documented, so it might not be very useful as a reference at the moment.