Somewhere around the end of last year I was working on a small private project to see how realistic and "next-gen" I could go with rendering a human being, and of course also to push the engine to its limits ;)
Now I've finally gotten around to publishing an article here on my blog about my findings, ideas and the knowledge I gained on the subject.
First of all, a big round of applause and praise to Jorge Jimenez, who has spent quite some time on this topic and has made his work publicly available (http://www.iryoku.com/stare-into-the-future). His work inspired and helped me a lot and was the basis for most of my own.
So the SSS (subsurface scattering) technique that I'm using here is the one proposed by Jimenez in his 2011 paper. If you want to know how it works in detail you can check it out on his blog. Basically it's a screen-space blur that is run as a post-effect in the pixel shader, using a separable blur kernel for more efficient execution.
The key here is to blur only the diffuse part of the image (marked by the stencil buffer) to smooth out high-frequency normal map (or bump) information, and to apply a color tint that makes the skin appear more realistic in the sense that it simulates the subsurface scattering that happens under the skin (light penetrating the skin, scattering around and exiting again).
This technique is quite common nowadays and performs fairly well.
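To give a rough idea of what one direction of such a separable blur pass looks like in HLSL, here's a minimal sketch. The texture names, the kernel contents and the sssWidth scale are placeholders of my own, not Jimenez's actual values (the real kernel uses carefully fitted per-channel weights):

Texture2D DiffuseTex : register(t0);       // diffuse-only lighting, stencil-masked to skin
Texture2D DepthTex : register(t1);         // linear depth
SamplerState LinearSampler : register(s0);

#define NUM_SAMPLES 7
float4 kernel[NUM_SAMPLES]; // rgb = per-channel weight, a = offset along the blur axis (placeholder)
float sssWidth;             // scatter radius (placeholder)
float2 blurDir;             // (1,0) for the horizontal pass, (0,1) for the vertical pass

float4 SSSBlurPS(float2 uv : TEXCOORD0) : SV_Target
{
    float4 center = DiffuseTex.Sample(LinearSampler, uv);
    float depth = DepthTex.Sample(LinearSampler, uv).r;

    // Keep the blur radius roughly constant in world space by dividing by depth.
    float2 stepUV = sssWidth * blurDir / depth;

    float3 color = center.rgb * kernel[0].rgb;
    for (int i = 1; i < NUM_SAMPLES; i++)
    {
        float2 offset = uv + kernel[i].a * stepUV;
        float3 sampleColor = DiffuseTex.Sample(LinearSampler, offset).rgb;
        float sampleDepth = DepthTex.Sample(LinearSampler, offset).r;

        // Simple depth rejection so the skin doesn't bleed over the background.
        float w = saturate(1.0f - 10.0f * abs(sampleDepth - depth));
        color += kernel[i].rgb * lerp(center.rgb, sampleColor, w);
    }
    return float4(color, center.a);
}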
For the specular part I'm using two lobes of the GGX distribution added on top of each other, with a ratio of 70% for the regular, smoother part of the highlight and 30% for a higher-frequency highlight that makes the details of the skin pores come out stronger.
It's a very subtle effect, but it does help to pronounce the detail a bit more.
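As a rough sketch of how the two lobes get combined (GGXSpecular here stands in for whatever single-lobe GGX evaluation is already in place, and the roughness values are just made-up examples):

// Two specular lobes: a broad, smoother one for the base highlight and a tighter
// one that brings out the pore-level detail. GGXSpecular() is an assumed helper.
float3 SkinSpecular(float3 N, float3 V, float3 L, float3 specColor)
{
    float roughnessBroad = 0.45f; // example value: soft base highlight
    float roughnessTight = 0.25f; // example value: sharp detail highlight

    float3 lobeA = GGXSpecular(N, V, L, roughnessBroad, specColor);
    float3 lobeB = GGXSpecular(N, V, L, roughnessTight, specColor);

    return 0.7f * lobeA + 0.3f * lobeB; // the 70% / 30% ratio mentioned above
}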
Another (very) important part of skin rendering is (yet again) indirect lighting.
The biggest issue when using extremely high resolution normal maps is that subsurface scattering only works on parts that are actually diffusely lit! Meaning if you take a very high-res normal map and light the model with a harsh light, you will notice the skin going back to its initial "stone" look because of the lack of SSS.
What this means is that it's really hard to get great looking skin under all kinds of light setups, because there are so many factors that need to work together (indirect light intensity, camera exposure, direct light intensity and its direction).
So what we have to do is always make sure that every part of the skin is lit in some way or another (idea: adaptive SSS? stronger on darker parts, weaker on bright parts maybe?).
Another thing that comes with indirect illumination is shadowing.
Shadowing is absolutely crucial for believable skin rendering, especially when it comes to image-based lighting and ambient occlusion. But soft and hard shadow edges directly from a light source itself can certainly make quite a difference too!
Applying AO as a form of environment reflection shadowing is a must; what we can also do, however, is add a fake SSS tint to the shadowed parts to make the skin come to life just a bit more.
In shader code this looks like the following:
float3 strength = float3(0.4f, 0.15f, 0.13f); // fake color bleeding for AO (red bleeds the most)
float3 colorBleedAO = pow(abs(AO), 1.0f - strength); // brighten/tint the occluded areas per channel
Diffuse += DiffuseAmbient * colorBleedAO;
And that's pretty much it about skin.
The only thing I'm missing at this point is translucency for the nostrils and ears; however, this could be approximated with translucency shading driven by an inverted AO map (http://www.crytek.com/download/2014_03_25_CRYENGINE_GDC_Schultz.pdf).
In the next part I will be talking about eye rendering and its (still ongoing) challenges.
Here are a few more pictures to showcase the skin shading:
I've finally gotten around to making this blog post about the new changes to the physically-based camera system in LipsEngine. As you might remember from my last post on this topic, I was trying to simulate how a real camera works by using real-world parameters like shutter speed, ISO or aperture. The issue with that previous approach was that even though I was using real-world parameters as input, the calculations weren't physically correct or based on anything but empirically chosen values that "fit" the output of a real camera. That meant lots of random magic numbers, and in the end the result wasn't that great. The new implementation calculates the CoC size based on real camera values, which is much nicer.

Another thing I decided to change was the fact that I used the shutter speed, ISO & aperture to calculate the exposure of the image. At first this sounds plausible, because that's how a real camera works in manual mode, but after working with it for a while I decided against it. The reason is that I found it very unintuitive in a virtual (3D engine) setting, and opted for a simpler solution: the regular auto-exposure via averaging the luminance of the last few frames, plus a setting to manually shift the exposure in both directions if needed.
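For reference, here's roughly how the CoC diameter falls out of the thin-lens model given the real camera values. The function and parameter names are just for illustration, not necessarily how it's laid out in the engine:

// Circle of confusion diameter via the thin-lens model. All distances share the
// same unit (e.g. millimeters); names here are illustrative only.
float CoCDiameter(float objectDist,    // distance of the shaded pixel to the camera
                  float focusDist,     // distance the lens is focused at
                  float focalLength,   // e.g. 50 (mm)
                  float fNumber)       // e.g. 1.4 for f/1.4
{
    float apertureDiameter = focalLength / fNumber;
    return apertureDiameter
         * (focalLength / (focusDist - focalLength))
         * abs(objectDist - focusDist) / objectDist;
}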
In my previous Depth of Field implementation I used an approach that extracted bright pixels based on luminance and CoC size and added them to an append/consume buffer, which I then rendered as bokeh sprites via geometry-shader point-to-quad expansion. The blur itself was then done with a large pseudo-separable disc blur.
You can find more on that technique here:
What I didn't quite like about that technique was that the bokeh shape always felt disconnected from the blur, and I had a hard time finding a setting that looked good to me, since we can't go crazy with the disc blur size and the already very expensive bokeh sprite rendering pass is a huge performance hog with potentially tremendous amounts of overdraw. And we all know geometry shaders are a delicate thing when it comes to outputting huge amounts of geometry...
So, the new technique is a scatter-as-gather approach similar to what can be found in Unreal Engine 4. It makes use of a very clever way of rendering bokeh, namely using multiple box-filter blurs that are skewed in certain directions to shape the blur into a hexagon. This works very efficiently because the blur itself isn't that expensive, and we now have a blur that is actually shaped like the aperture, just like it would be in a real-life camera. Another benefit is that since the bokeh shape emerges from the blur itself, we don't have to draw bokeh sprites on top anymore, so the overdraw issue of the previous technique is completely gone.
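To illustrate the core idea (this is a sketch with placeholder names, not the exact UE4 or engine code): a single skewed box-blur pass is nothing more than a fixed number of equally weighted taps along one direction, and combining a handful of such passes along different directions is what shapes the blur into the hexagon:

Texture2D ColorTex : register(t0);
Texture2D CoCTex : register(t1);           // per-pixel blur radius in UV units
SamplerState LinearSampler : register(s0);

float2 blurDir; // direction of this pass; each pass uses a different (skewed) direction

float4 DirectionalBlurPS(float2 uv : TEXCOORD0) : SV_Target
{
    const int NUM_SAMPLES = 16;
    float coc = CoCTex.Sample(LinearSampler, uv).r;

    float4 sum = 0.0f;
    for (int i = 0; i < NUM_SAMPLES; i++)
    {
        // Equally weighted samples along blurDir -> a box filter, not a gaussian.
        float t = i / (float)(NUM_SAMPLES - 1);
        sum += ColorTex.Sample(LinearSampler, uv + blurDir * coc * t);
    }
    return sum / NUM_SAMPLES;
}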
To further increase performance I'm also downsampling to 1/2 resolution during the box blurs, using a gaussian filter to smooth out the transition between blurred (half-res) and unblurred (full-res) regions.
Here are a few pictures of it in action:
f/22 @ 210mm
f/5.6 @ 210mm
f/1.4 @ 210mm (a little out of focus)
f/1.4 @ 210mm (changing focus to make it completely out of focus)
f/1.4 @ 210mm (focus on the back of the gun + higher ISO for more image grain)
f/1.4 @ 120mm (same aperture but with a smaller focal length, you can already see the DoF is getting bigger)
f/1.4 @ 55mm (again same aperture but with a very small focal length, you can now see that there's almost nothing out of focus and you have a very wide viewing angle)
The latest feature I implemented in LipsEngine is a method to reduce, and almost completely remove, the aliasing caused by high-frequency normal maps and the limited pixel shader evaluation frequency. As you know, the pixel shader works on rasterized fragments, of which there are roughly as many as there are pixels on screen. If you now imagine these "sample points" as a grid and picture how they "move" over the shaded image, you can clearly see how this will without a doubt lead to aliasing. The issue is further amplified by the wide range of pixel intensity values being evaluated, which is very common in physically based shading.
A high-frequency normal map like the one seen below, for example, can create extreme cases of aliasing when evaluating your lighting's BRDF with a very low roughness value:
This has been the case in pretty much every game released so far, though recently there have been some exceptions. To mention a few recent titles that successfully tackled this issue: Ryse (Crytek) and The Order: 1886 (Ready At Dawn).
There are several ways to go about it. A nice article can be found here: http://blog.selfshadow.com/2011/07/22/specular-showdown/.
Basically what all of these techniques try to do is find a way to evaluate the BRDF at a higher frequency than your regular pixel shader does. You could for example go the brute force way and implement some kind of shader supersampling or texture-space shading to obtain a "ground truth" result. However this is of course infeasible in real time, so we need to look for other ways...
A more interesting approach is Toksvig AA (a technique presented by Michael Toksvig in 2004) or the one presented by the guys at Ready At Dawn (which is also the one I've implemented in my engine). Both try to find a new "corrected" roughness value by evaluating an "effective BRDF" that properly accounts for the variance of all the normal map texels covered by a pixel. We can compute this value for every mip level of the normal map and then either store the new roughness value in a texture or compute it in real time during the material shading pass.
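To make that a bit more concrete, here is the classic Toksvig correction written out (formulated for a Blinn-Phong specular power; the Ready At Dawn variant follows the same idea of widening the lobe based on how short the averaged normal is, just expressed in terms of GGX roughness). The code below is a generic sketch, not the exact code in my engine:

// Classic Toksvig factor: the shorter the averaged (mipmapped, unnormalized) normal,
// the more the normals inside the pixel footprint disagree, so the specular lobe is
// widened by reducing the specular power. avgNormal comes from the normal map mip chain.
float ToksvigPower(float3 avgNormal, float specPower)
{
    float len = length(avgNormal);                // < 1 when the normals vary
    float ft = len / lerp(specPower, 1.0f, len);  // = len / (len + specPower * (1 - len))
    return ft * specPower;                        // corrected (lower) specular power
}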
I absolutely believe anti-aliasing is crucial for closing the gap between real-time rendering and offline CG. And now that there are fast and efficient methods for these issues, we should make sure to embrace them in future games.
There's a quote by Stephen Hill that I like:
Aliasing is one key differentiator between us real-time folk and the offline guys. Whilst they can afford to throw more samples at the problem, heavy supersampling would be too slow for us. MSAA is also ineffective as it only handles edge aliasing, so under-sampling artefacts within shading will remain - specular shimmering being a prime example. Post-process AA isn’t really helpful either since it does nothing to address sharp highlights popping in and out of existence as the camera or objects move. Finally, you might think that temporal AA could be a solution, but that’s really just a poor man’s supersampling across frames, with added reliance on temporal coherency.
And finally, some comparison shots and videos:
This time I wanna talk about my recently implemented physically based camera setup.
I stumbled upon this topic while researching for Depth of Field Post-Processing techniques.
You can find some really nice write-ups here:
Basically what it comes down to is trying to simulate how a camera behaves in the real world, including its problems and constraints. I'm still a little uncertain as to how much sense this makes, as it can be quite difficult to wrap your head around if you're not a photographer/cinematographer (or someone proficient with digital cameras).
There's also some breathing room as to how far you want to go with simulating a real-life camera, precisely because of the restrictions it brings with it.
Let me give you a quick rundown...
There are 3 settings on a digital camera that control the final image's exposure (i.e. how bright the image is):
- Aperture
- Shutter Speed
- ISO Speed
Aperture is the opening in the lens that determines how much light (how many stops of light) is allowed to enter.
It is measured in f-stops (N) and describes the ratio of the focal length (f) to the diameter of the lens opening (D):
N = f / D
For example, a 50mm lens with a 25mm opening is at N = 50 / 25 = f/2.
Furthermore, it directly affects something called Depth of Field. The Depth of Field is the area in your image that is in focus; the rest of the picture becomes increasingly blurry and shows something called bokeh. Imagine a small dot on the sensor where the lens focuses all the light. If you were to move the sensor out of the plane where the light converges (i.e. out of focus), that small dot of focused light would grow and spread outwards. This dot is what is known as the Circle of Confusion (CoC).
So to come back to rendering: the aperture size is an important variable for simulating Depth of Field and bokeh.
Shutter speed is the amount of time the shutter stays open to let light in. It is measured in seconds (e.g. 2 sec, 1 sec, 1/200 sec).
In rendering this corresponds to motion blur and its strength: the longer the shutter is open, the stronger our motion blur. However, since we can't properly simulate this as an integration of accumulated light over time, we have to approximate it in the form of a motion blur strength in the range [0, 1]:
1 sec = 1.0 (max motion blur)
1/1000 sec = 0.0 (no motion blur)
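One simple way to do that mapping (this is just an illustrative choice, not a physically meaningful curve) is to remap the shutter time logarithmically between those two endpoints:

// Map the shutter time (in seconds) to a [0,1] motion blur strength:
// 1/1000 sec (or faster) -> 0.0, 1 sec (or slower) -> 1.0, logarithmic in between.
float MotionBlurStrength(float shutterTime)
{
    const float minShutter = 1.0f / 1000.0f;
    const float maxShutter = 1.0f;
    float t = log2(shutterTime / minShutter) / log2(maxShutter / minShutter);
    return saturate(t);
}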
ISO speed is the sensitivity of the sensor/film to light. A higher ISO gives a more exposed (brighter) image but introduces noise.
These 3 variables together control the image's final exposure. If you increase one of them, you have to decrease one of the others to keep the same exposure. For example, you can't expect to get a well exposed image when using a small aperture (a high f-number like f/22), a fast shutter speed (1/2000) and a low ISO (100).
In that case you'd have three choices:
- Use a slower shutter speed so that more light comes in (over time)
- Increase the aperture size (lower the f-number) so that more light comes in
- Increase the ISO speed, although this will create a noisier image
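To put some numbers on that trade-off, photographers express the combination of the three settings as an exposure value. This is the standard photographic formula, not something specific to my engine:

// Exposure value normalized to ISO 100 (EV100); a higher EV means the settings admit less light.
// Standard formula: EV100 = log2(N^2 / t) - log2(ISO / 100).
float ComputeEV100(float fNumber, float shutterTime, float iso)
{
    return log2((fNumber * fNumber) / shutterTime * (100.0f / iso));
}

With the settings from the example above (f/22, 1/2000 sec, ISO 100) this lands at roughly EV 20, which only an extremely bright scene could expose properly; opening the aperture up to f/1.4 alone would already bring that down by about 8 stops.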
Another variable that is important to our physically based camera system is the focal length (measured in mm).
This corresponds to zooming in real life (changing the focal length of a zoom lens).
In my engine I've made this to be in the common range of 18mm - 210mm.
It's also quite easy to convert this directly to its corresponding Field of View (FoV) using the following functions:
#include <cmath>
const double PI = 3.14159265358979323846;

// Focal length (mm) to FoV (degrees)
inline double FocalLengthToFOV(const double focalLength)
{
    const double filmSize = 43.266615300557; // diagonal of a full-frame 36mm x 24mm sensor
    return (2 * std::atan(filmSize / (2 * focalLength))) * 180 / PI;
}

// FoV (degrees) to focal length (mm)
inline double FOVToFocalLength(const double FoV)
{
    const double filmSize = 43.266615300557; // diagonal of a full-frame 36mm x 24mm sensor
    return filmSize / (2 * std::tan(PI * FoV / 360));
}
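As a quick sanity check of these numbers: a 50mm focal length comes out to a (diagonal) FoV of roughly 46.8 degrees, which is about what you'd expect from a "normal" lens on a full-frame camera.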
The relationship between aperture and focal length is something I'm still working on...so don't be alarmed if the result isn't exactly what you'd expect from a real camera.
Here are some screenshots showing the resulting image using different camera settings: